Different usage contexts of mode indicators, and the '-' mode

jschimpf · January 22, 2025, 12:44am

Further to our recent discussions, I wanted to concretise a point I was trying to make. I can see 3 different contexts in which we want to use +/-/? mode indicators:

Descriptive mode alternatives: these are provided by the designer/implementor of a predicate; often there are several of them for one predicate, possibly with a determinism indicator; ISO uses these and requires them to be mutually exclusive. E.g.

    string_code(+,+,-) is det        “can be called like this...”
    string_code(-,+,?) is nondet     “...or like that (else error/undefined)”

Descriptive mode summary: also provided by the designer/implementor of a predicate; only one per predicate (which can ideally be obtained by generalising from the above alternatives); the purpose is to have a compact (although imprecise) one-line description for a predicate. E.g.

    string_code(?,+,?)               “anything else is error/undefined”

Mode promise: this would be provided by an application writer or by a code analyzer; one or multiple alternatives. E.g.

    string_code(-,+,-)                “I promise to only make calls like this”

All of these are in current use, for example

the ISO standard uses (1) only
Ciao declarations support (1), albeit without requiring mutual exclusivity
SWI and SICStus use (2) in their documentation
ECLiPSe uses (1) and (2) in its documentation, and (3) in its mode/1 declaration
…

The ‘-’ mode
To distinguish between these contexts is helpful when discussing the various meanings of the ‘-’ mode, which Manuel documented in his draft.

“The argument is/shall be a variable” (ISO, GP):
This simple meaning makes sense in context (1), because it can serve to distinguish between mode alternatives (often there will be an alternative mode where the corresponding argument has ‘+’ mode). On the other hand, in context (2) and (3) this meaning is rarely useful.
“An output argument. It may or may not be bound at call-time. If the argument is bound at call time, the goal behaves as if the argument were unbound, and then unified with that term after the goal succeeds” (SWI,ECL/doc):
This makes sense in context (2), where it describes the common case of a steadfast output argument. Such arguments would otherwise have to be described with ‘?’, and the steadfastness aspect would be lost. In contexts (1) and (3) this meaning makes no sense.
“A free variable without any constraints, especially it must not occur in any other argument and it cannot be a suspending (or constrained) variable”. (ECL/decl):
This very strict meaning is only useful as a promise in context (3), because it enables compiler optimizations such as improved indexing, reordering unifications, determinism detection etc.

Does this sound right? If yes, one would have to decide whether it is ok to have different meanings of ‘-’ depending on context, or whether one should use 3 different symbols…

pmoura · January 22, 2025, 7:22am

These are the definitions I use in Logtalk:

https://logtalk.org/manuals/userman/predicates.html#mode-directive

Mode indicators – and ++ can also be found in some Prolog systems.

herme · January 22, 2025, 8:41am

Thanks Paulo, I will add them to the PIP draft.

herme · January 22, 2025, 9:12am

Yes, these are all useful applications of modes, and I agree we want to use modes for all of those scenarios. Btw, in Ciao they are used for all of those purposes: documentation, verification, optimization, etc. (i.e., they are used for all of 1, 2, and 3). You can also write multiple modes for a predicate in any of the uses; hey this is LP right! :-). Of course one can always instead (or even also) provide a summary mode that LUBs all the different uses if one prefers so.

Btw, there are also other purposes for modes/assertions, such as guiding the analysis, expressing analysis results, defining external interfaces, etc., etc. but these are more advanced and I think they will easily be covered by whatever we decide.

Does this sound right? If yes, one would have to decide whether it
is ok to have different meanings of ‘-’ depending on context, or
whether one should use 3 different symbols…

Using the same mode symbol with different meanings depending on context can be attractive in a way, but I think it would be a source of problems/confusion. I think it would be much clearer to have different symbols when their meanings are different. Apart from clarity, this has the advantage that the modes information can be ‘taken seriously’ by the different parts of the system when convenient.

JanWielemaker · January 22, 2025, 10:24am

I think these messages perfectly describe the status. Thanks. I still have trouble to make up my mind where this should go though

As a side note, as I already mentioned in the last discussion, SWI-Prolog added a real implementation of mode/1 as found in various Prolog systems. Before, it merely had an auto-loaded dummy implementation to accept code having these declarations. The supported modes are (for now) -, + and ? and the only implication is that - causes the JIT clause indexing to never consider this argument for creating an index. Any value is accepted for the argument and the semantics is unaffected. That may all change, depending on where this ends For one thing, I’m not entirely happy about this as declaring an argument as - to affect indexing while still allowing the argument to be instantiated can make sense if you want to avoid creating the index if you know it is called only once with this argument instantiated. In that case the index creation probably merely wastes time and memory.

herme · January 22, 2025, 11:28am

Btw, as a reminder to readers, the current status of the PIP is maintained at PIP 0103 - Modes and their meaning.

jschimpf · January 22, 2025, 12:43pm

OK, then to stimulate discussion:

-
indicates a steadfast output argument. It may or may not be bound at call-time. If the argument is bound at call time, the goal behaves as if the argument were unbound, and then unified with that term after the goal succeeds. No errors are raised.
Precedents: current use in SWI, SICStus, ECLiPSe, Logtalk, …

--
indicates that the argument should be unbound at call time. If not, then additional documentation may specify that an error is raised, otherwise the behaviour is undefined. Precedents: current use in SWI, Logtalk, …

---
indicates a free variable, not instantiated, not aliased, not constrained, i.e. it can be unified with anything without risk of failure. For mode promises and analysis results.

----
like --- but the argument is also the first occurrence of this variable. This can be used by a compiler to chose a unification-free value-return mechanism that does not require the creation of a result variable at all. Would probably just be used to convey analysis results.

A challenge would be to see which of these can be described using Ciao’s formalism.

Caveat: these characterisations only talk about the situation at call time. Since this is already complex enough, I’d suggest that additional semantics like “argument will be instantiated at exit time” or “argument will not be further instantiated” are better handled by an orthogonal mechanism.

herme · January 23, 2025, 2:44pm

A challenge would be to see which of these can be described using Ciao’s formalism.

Actually, before getting into this, let me first formulate some modes that are less informal, to better illustrate Ciao’s formalism:

For example for the the ISO modes (isomodes package in Ciao) we have:

+ : the argument should be nonvar/1 on calls.
  :- modedef +(A) : nonvar(A).
? : no conditions on the argument.
  :- modedef ?(_).
@ : the argument is not further instantiated on success.
  :- modedef @(A) : nonvar(A) + not_further_inst(A).
- : the argument is var/1on calls and nonvar/1 on success.
  :- modedef -(A) : var(A) => nonvar(A).

As another example, this is part of the Ciao ‘modes’ package, which is
similar to current SWI, SICStus, ECLiPSe, Logtalk, etc.:

+ : the argument is nonvar/1 on calls.
  :- modedef +(A) : nonvar(A).
- : the argument is nonvar/1 on success.
  :- modedef  -(A) => nonvar(A).
? : no conditions on the argument.
  :- modedef ?(_).
@ : the argument is not further instantiated on success.
  :- modedef @(A) : nonvar(A) + not_further_inst(A).

And a few other examples of potentially useful input-output modes (Mercury-style):

in : the argument is ground/1 on calls and on success.
 :- modedef in(A)  : ground(A) => ground(A).
++ : alias for in.
 :- modedef in(A)  : ground(A) => ground(A).
out : the argument is var/1 on calls and ground/1 on success.
 :- modedef out(A) : var(A)    => ground(A).
go : the argument is ground/1 on success.
 :- modedef go(A)              => ground(A). 

Now, regarding the ones you mention Joachim, some first thoughts:

---
indicates a free variable, not instantiated, not aliased, not
constrained, i.e. it can be unified with anything without risk of
failure. For mode promises and analysis results.

This one is easy:

   :- modedef  ---(A) : ivar(A).

i.e., it is an independent variable on calls. ivar/1 is a property
in Ciao that means exactly that, var/1, plus indep/2 from other
vars. This one can often be checked at compile time and otherwise at
run-time.

----
like --- but the argument is also the first occurrence of this
variable. This can be used by a compiler to chose a unification-free
value-return mechanism that does not require the creation of a
result variable at all. Would probably just be used to convey
analysis results.

For this we would add ‘first_occurrence/1’, as a property, and combine
both:

 :- prop first_occurrence(X) # "`X` is a first occurrence."
 :- modedef  ---(A) : (ivar(A),first_occurrence(X)).

‘first_occurrence/1’ is detected always by the compiler, so it can
always be verified or marked as error already at compile time.

The other modes, at least as described, are really more of the
‘documentation type’ that we discussed and should probably be treated
as such:

--
indicates that the argument should be unbound at call time.
If not, then additional documentation may specify that an error is
raised, otherwise the behaviour is undefined.

The first part would be easy:

:- modedef  --(A) : var(A).

but it implies that an error must be raised if the argument is
nonvar. We do not currently use modedef to express several cases
(it could be done, but we would use instead several assertions using different modes).
Also, the part on “additional documentation may specify that an error
is raised” seems more a comment in a generic description of a mode
than something that you want the mode to mean when you use it?
The actual content would be in the documentation (or modes or
assertions) of the predicate. I.e., if a particular predicate wants to
say that an error should be raised then it can use the corresponding
mode (--- in this list) in its documentation, code, etc.

In any case, if we do want to define this one as is using modedef
then we can define an (uncheckable) property:

 :- prop maybe_var(X) # "`X` should be unbound at call time. 
    If not, then additional documentation may specify that an error is
    raised, otherwise the behaviour is undefined."

 :- modedef  --(A) : maybe_var(A).

or even skip the definition of the property, and add a comment to an
empty modedef.

For the final case:

-
indicates a steadfast output argument. It may or may not be bound
at call-time. If the argument is bound at call time, the goal
behaves as if the argument were unbound, and then unified with that
term after the goal succeeds. No errors are raised.

Since no errors are to be raised then it is again just documentation.
We can simply define a (computational) property:

:- prop steadfast(G,X)
# "Argument `X` is a steadfast output for goal `G`. It may or may not be bound
   at call-time. If the argument is bound at call time, the goal
   behaves as if the argument were unbound, and then unified with that
   term after the goal succeeds. No errors are raised.".

We can always then define analyses or run-time tests for the
properties if needed.

Caveat: these characterisations only talk about the situation at call
time. Since this is already complex enough, I’d suggest that
additional semantics like “argument will be instantiated at exit time”
or “argument will not be further instantiated” are better handled by
an orthogonal mechanism.

I put some examples above of modes that also talk about the output
(e.g., - in the standard).

Lots of material for our discussions! (And we haven’t even started to
deal with types.)`

jfmc · January 26, 2025, 12:50am

Hi all,
Some thoughts on ‘-’ as “documentation output” mode. According to recent discussions (please Jan correct me if I’m wrong) it seems to be some conflict w.r.t “output” as some observable property (using instantiation checks) vs “intended output”, as in read/1 as documented in SWI (read(-Term)).

Let us consider those predicates:

p1(X) :- X = _.
p2(X) :- Y = f(_), arg(1,Y,X).

Should we consider any of those cases as examples of predicates with output arguments? Operationally, the engine is doing some work and binding “X” to a new variable. But a smart compiler may turn that into simply p1() or p2(). Then read/1 when fed with the “_.” input:

  ?- read(X).
  _.

is indistinguishable from read(_).. We could even implement a smart read/1 predicate that does not touch X at all when what we read is a variable:

  my_read(X) :- read(Y), ( var(Y) -> true ; X = Y ).

Maybe the case for read/1 is closer to Joachim’s --- mode? I mean, clearly it is useful to know what the argument of read/1 comes from some internal computation or IO operation, but there is no way in Prolog to “observe” that fact (using instantiation check predicates). Ciao has (internally) something equivalent for some builtins. It is used as a calling convention so is “stronger” than a mode. Those output args are “returned” from the predicate (like in a C function). That implies that those can never be “input” predicates (some predicates could never be compiled with that convention), and they can be just fresh variables on success. So it is an optimization that is not observable from Prolog itself (although measurable in terms of CPU operations, etc.)

If some effect cannot be observed or checked, then why is it needed? I still feel that this information is useful, but maybe just as a hint for the optimizer, or to give the programmer an idea of the execution cost, memory consumption. The doc-output mode looks very natural when looking at Prolog as a glorified C with unification and backtracking but it is really hard to understand from the declarative side of Prolog (at least for me!).

I think that modes are really hard because operationally Prolog is much harder than functional or imperative programming. One proposal would be to introduce modes in different layers:

fully “observable” modes, observable with instantiation checks
other modes that observable from Prolog (not necessarily efficiently, like steadfastness)
other mode-related declarations that are related with calling conventions, non-functional properties, memory consumption, etc. (like —)

pmoura · January 26, 2025, 2:28pm

Note that the ISO Prolog Core standard correctly uses a read_term(@stream_or_alias, ?term, +read_options_llist) mode specification (read/1 is a bootstrapped built-in predicate, thus implying read(?term).

JanWielemaker · January 26, 2025, 3:16pm

I’ve heart the accusation more often Yes, I think you are right. However, most of the built-ins fall in this category. In fact, in most of our systems they are precisely that as they are implemented in C. In SWI-Prolog, - in the docs means

The argument does not affect the computation. I.e., the computation examines (may examine) the other arguments and/or the environment (e.g., read/1) to produce a term (which may be a variable) and unifies the - argument to this term.

I think that qualifies as a precise definition. It is even “observable” if we can observe the memory access that happens in the implementation.

SWI-Prolog’s + is defined similarly as an argument that does affect the computation. So,

 get_attr(+Var, +Module, -Value)

Here Var is normally unbound and possibly has a constraint. In ISO modes this should be ? and as Value may be unbound, the mode would be get_attr(?Var, +Module, ?Value), which fails to convey how this predicate should be used. The ISO mode is fine for verification, but poor for documentation.

“Pure” Prolog code using the definition of Markus Triska only has ? arguments. The Prolog code you find in the wild is typically somewhere in between because limitations are enforced by called built-ins, some modes lead to non-termination, etc. I still have problems with this. Using “summary” modes, we have atom_codes(?Atom, ?Codes), which actually refers to two modes (+,-) and (-,+), in ISO really (+,? and ?,+). On the other hand, we have sub_atom(+Atom, ?Before, ?Len, ?After, ?Sub) that is, as long as the first argument is bound, completely “pure”. Sometimes I think we should have a notation that tells us that a specific subset of the arguments is completely “pure”.

Overall, I think the “solution” we are looking for should

Separate documentation from other purposes
Introduce new notations for input/output. Possibly > and < can help? For rendered docs we can use some Unicode symbol, but we also want something ASCII for source files.
For almost all predicates, ISO + maps to input

herme · February 6, 2025, 2:30pm

I added some more material for discussion in the modes PIP, gathering discussions so far and including an area to reflect the evolving proposal.