Introduction
We tested all 1381 TRSs of TPDB 7.0 using AProVE with exactly those strategies that are described in the paper Lazy Abstraction for Size-Change Termination by Michael Codish, Carsten Fuhs, Jürgen Giesl, and Peter Schneider-Kamp. In this way, we repeated the experiments from that paper.
In detail, the strategies are using four different base orderings
- embedding ordering (EMB)
- lexicographic path ordering (LPO)
- recursive path ordering (RPO)
- polynomial ordering (POLO)
- the reduction pair processor (REDPAIR)
- the reduction pair processor using SCNP reduction pairs with three different heuristics (SCNPFAST, SCNPMAX, SCNPALL)
- size-change termination (SCT)
SCNPFAST | SCNPMAX | SCNPALL | REDPAIR | SCT | |
---|---|---|---|---|---|
EMB | SCNPFAST_EMB | SCNPMAX_EMB | SCNPALL_EMB | REDPAIR_EMB | SCT_EMB |
LPO | SCNPFAST_LPO | SCNPMAX_LPO | SCNPALL_LPO | REDPAIR_LPO | SCT_LPO |
RPO | SCNPFAST_RPO | SCNPMAX_RPO | SCNPALL_RPO | REDPAIR_RPO | SCT_RPO |
POLO | SCNPFAST_POLO | SCNPMAX_POLO | SCNPALL_POLO | REDPAIR_POLO | SCT_POLO |
Our experiments and the original experiments show only neglectable differences: on the 20 * 1381 = 27620 experiments, only 8 results differ due to timing issues.
The main purpose of our experiments was to certify the generated proofs. And indeed, in our experiments we detected that several proofs have been flawed as the usable rules have not been computed correctly for the proof output, e.g. in this proof for AG01/3.1 to compute division on natural numbers, the minus-rules are not usable. Note however, that only the proof output of AProVE was affected, since the search for reduction pairs uses a different routine to compute usable rules.
After the computation of usable rules has been corrected by Carsten Fuhs, indeed all proofs could be certified. Note that the use of the generalized multiset ordering was essential. In these proofs the use of the standard multiset ordering was insufficient to orient the constraints.
Concerning execution time one can further observe that certification of SCNP reduction pair proofs is faster than certification of size-change termination proofs: the average certification time for SCNP_* is 33 milliseconds, whereas it is 50 milliseconds for SCT_*.
All experiments have been performed on a machine with two 2.8 GHz Quad-Core Intel Xeon processors and 6 GB of main memory using a 60 seconds timeout. In the result table one can click on the execution times to see the proofs and answers of CeTA (version 2.2).
Summary
SCNPFAST_EMB | SCNPFAST_LPO | SCNPFAST_RPO | SCNPFAST_POLO | SCNPMAX_EMB | SCNPMAX_LPO | SCNPMAX_RPO | SCNPMAX_POLO | SCNPALL_EMB | SCNPALL_LPO | SCNPALL_RPO | SCNPALL_POLO | REDPAIR_EMB | REDPAIR_LPO | REDPAIR_RPO | REDPAIR_POLO | SCT_EMB | SCT_LPO | SCT_RPO | SCT_POLO | Total | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Examples | 1381 | 1381 | 1381 | 1381 | 1381 | 1381 | 1381 | 1381 | 1381 | 1381 | 1381 | 1381 | 1381 | 1381 | 1381 | 1381 | 1381 | 1381 | 1381 | 1381 | |
Successes | 346 (346 YES, 0 NO) | 500 (500 YES, 0 NO) | 501 (501 YES, 0 NO) | 477 (477 YES, 0 NO) | 346 (346 YES, 0 NO) | 530 (530 YES, 0 NO) | 530 (530 YES, 0 NO) | 515 (515 YES, 0 NO) | 347 (347 YES, 0 NO) | 529 (529 YES, 0 NO) | 531 (531 YES, 0 NO) | 516 (516 YES, 0 NO) | 325 (325 YES, 0 NO) | 505 (505 YES, 0 NO) | 527 (527 YES, 0 NO) | 511 (511 YES, 0 NO) | 341 (341 YES, 0 NO) | 385 (385 YES, 0 NO) | 385 (385 YES, 0 NO) | 378 (378 YES, 0 NO) | 573 (573 YES, 0 NO) |
Failures | 1035 (12 timeouts) | 881 (12 timeouts) | 880 (13 timeouts) | 904 (12 timeouts) | 1035 (12 timeouts) | 851 (21 timeouts) | 851 (25 timeouts) | 866 (14 timeouts) | 1034 (13 timeouts) | 852 (31 timeouts) | 850 (35 timeouts) | 865 (19 timeouts) | 1056 (12 timeouts) | 876 (16 timeouts) | 854 (18 timeouts) | 870 (12 timeouts) | 1040 (128 timeouts) | 996 (127 timeouts) | 996 (127 timeouts) | 1003 (128 timeouts) | |
Time (termination tool) | 0h 56m 58s | 1h 3m 49s | 1h 5m 53s | 1h 4m 56s | 0h 58m 22s | 1h 20m 40s | 1h 27m 38s | 1h 17m 15s | 1h 1m 49s | 1h 40m 20s | 1h 47m 50s | 1h 28m 22s | 0h 56m 51s | 1h 12m 50s | 1h 17m 43s | 1h 2m 54s | 2h 52m 1s | 2h 52m 9s | 2h 52m 48s | 2h 51m 56s | 31h 13m 15s |
Accepted | 346 | 500 | 501 | 477 | 346 | 530 | 530 | 515 | 347 | 529 | 531 | 516 | 325 | 505 | 527 | 511 | 341 | 385 | 385 | 378 | 573 |
Rejected | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |
Unsupported | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |
Time (certification) | 0h 0m 5s | 0h 0m 18s | 0h 0m 17s | 0h 0m 14s | 0h 0m 5s | 0h 0m 21s | 0h 0m 20s | 0h 0m 21s | 0h 0m 5s | 0h 0m 20s | 0h 0m 20s | 0h 0m 20s | 0h 0m 5s | 0h 0m 19s | 0h 0m 20s | 0h 0m 15s | 0h 0m 7s | 0h 0m 25s | 0h 0m 25s | 0h 0m 17s | 0h 5m 30s |
Color codes
Success (Proof) |
Success (Disproof) |
Failure |
Timeout |
Accept |
Reject |