A User’s Guide to Picat
Version 3.5
Neng-Fa Zhou and Jonathan Fruhman
Copyright ©picat-lang.org, 2013-2023.
Last updated July 1, 2023
Picat is a general-purpose language that incorporates features from logic programming, functional programming, constraint programming, and scripting languages. The letters in the name summarize Picat’s features:
The support of unification, non-determinism, tabling, and constraints makes Picat more suitable than functional and scripting languages for symbolic computations. Picat is more convenient than Prolog for scripting and modeling. With arrays, loops, and comprehensions, it is not rare to find problems for which Picat requires an order of magnitude fewer lines of code to describe than Prolog. Picat is more scalable than Prolog. The use of pattern-matching rather than unification facilitates indexing of rules. Picat is not as powerful as Prolog for metaprogramming and it’s impossible to write a meta-interpreter for Picat in Picat itself. Nevertheless, this weakness can be remedied with library modules for implementing domain-specific languages.
The Picat implementation is based on the B-Prolog engine. The current implementation is ready for many kinds of applications. It also serves as a foundation for new additions. The project is open, and you are welcome to join as a developer, a sponsor, a user, or a reviewer. Please contact picat@picat-lang.org and join the news group picat-lang@googlegroups.com.
The copyright of Picat is owned by picat-lang.org. Picat is provided, free of charge, for any purposes, including commercial ones. The C source files of Picat are covered by the Mozilla Public License, v. 2.0 (http://mozilla.org/MPL/2.0/). In essence, anyone is allowed to build works, including proprietary ones, based on Picat, as long as the Source Code Form is retained. The copyright holders, developers, and distributors will not be held liable for any direct or indirect damages.
The initial design of Picat was published in December 2012, and the first alpha version was released in May 2013. Many people have contributed to the project by reviewing the ideas, the design, the implementation, and/or the documentation, including Roman Barták, Nikhil Barthwal, Mike Bionchik, Lei Chen, Veronica Dahl, Claudio Cesar de Sá, Agostino Dovier, Sergii Dymchenko, Julio Di Egidio, Christian Theil Have, Håkan Kjellerstrand, Annie Liu, Nuno Lopes, Marcio Minicz, Richard O’Keefe, Lorenz Schiffmann, Paul Tarau, and Jan Wielemaker. Special thanks to Håkan Kjellerstrand, who has been programming in Picat and blogging about Picat since May 2013. The system wouldn’t have matured so quickly without Håkan’s hundreds of programs (http://hakank.org/). Thanks also go to Bo Yuan (Bobby) Zhou, who designed the picat-lang.org web page, Sanders Hernandez, who implemented the interface to the FANN neural network library, and Domingo Alvarez Duarte, who ported Picat to MinGW (https://github.com/mingodad/picat). The Picat project was supported in part by the NSF under grant numbers CCF1018006 and CCF1618046.
The Picat implementation is based on the B-Prolog engine. It uses the following public domain modules: token.c by Richard O’Keefe; getline.c by Chris Thewalt; bigint.c by Matt McCutchen; Espresso (by Berkeley); FANN by Steffen Nissen. In addition, Picat also provides interfaces to the SAT solver Kissat (https://github.com/arminbiere/kissat), Gurobi by Gurobi Optimization, Inc, CBC by John Forrest, GLPK by Andrew Makhorin, and Z3 by Microsoft.
Before we give an overview of the Picat language, let us briefly describe how to use the Picat system. The Picat system provides an interactive programming environment for users to load, debug, and execute programs. Users can start the Picat interpreter with the OS command picat.
OSPrompt picat |
Once the interpreter is started, users can type a command line after the prompt Picat>. The help command shows the usages of commands, and the halt command terminates the Picat interpreter. Users can also use the picat command to run a program directly as follows:
OSPrompt picat File Arg1 Arg2 … Argn |
where File (with or without the extension .pi) is the main file name of the program. The program must define a predicate named main/0 or main/1.1 If the command line contains arguments after the file name, then main/1 is executed. Otherwise, if the file name is not followed by any arguments, then main/0 is executed. When main/1 executed, all of the arguments after the file name are passed to the predicate as a list of strings.
Picat is a dynamically-typed language, in which type checking occurs at runtime. A variable in Picat is a value holder. A variable name is an identifier that begins with a capital letter or the underscore. An attributed variable is a variable that has a map of attribute-value pairs attached to it. A variable is free until it is bound to a value. A value in Picat can be primitive or compound.
A primitive value can be an integer, a real number, or an atom. A character can be represented as a single-character atom. An atom name is an identifier that begins with a lower-case letter or a single-quoted sequence of characters.
A compound value can be a list in the form [t1,…,tn] or a structure in the form $s(t1,…,tn) where s stands for a structure name, n is called the arity of the structure, and each ti (1 ≤ i ≤ n) is a term which is a variable or a value. The preceding dollar symbol is used to distinguish a structure from a function call. Strings, arrays, maps, sets, and heaps are special compound values. A string is a list of single-character atoms. An array takes the form {t1,…,tn}, which is a special structure with the name ’{}’. A map is a hash-table represented as a structure that contains a set of key-value pairs. A set is a special map where only keys are used. A heap is a complete binary tree represented as an array. A heap can be a min-heap or a max-heap.
The function new_struct(Name,IntOrList) returns a structure. The function new_map(S) returns a map that initially contains the pairs in list S, where each pair has the form Key = V al. The function new_set(S) returns a map set that initially contains the elements in list S. A map-set is a map in which every key is mapped to the atom not_a_value. The function new_array(I1,I2,…,In) returns an n-dimensional array, where each Ii is an integer expression specifying the size of a dimension. An n-dimensional array is a one-dimensional array where the arguments are (n-1)-dimensional arrays.
Picat allows function calls in arguments. For this reason, it requires structures to be preceded with a dollar symbol in order for them to be treated as data. Without the dollar symbol, the command S=point(1.0,2.0) would call the function point(1.0,2.0) and bind S to its return value. In order to ensure safe interpretation of meta-terms in higher-order calls, Picat forbids the creation of terms that contain structures with the name ’.’, index notations, array comprehensions, list comprehensions, and loops.
For each type, Picat provides a set of built-in functions and predicates. The index notation X[I], where X references a compound value and I is an integer expression or a range in the form l..u, is a special function that returns a single component (when I is an integer) or a list of components (when I is a range) of X. The index of the first element of a list or a structure is 1. In order to facilitate type checking at compile time, Picat does not overload arithmetic operators for other purposes, and requires an index expression to be an integer or an integer range.
A list comprehension, which takes the following form, is a special functional notation for creating lists:
[T : E1 in D1, Cond1, …, En in Dn, Condn] |
where T is an expression, each Ei is an iterating pattern, each Di is an expression that gives a compound value, and the optional conditions Cond1,…,Condn are callable terms. This list comprehension means that for every tuple of values E1 ∈ D1, …, En ∈ Dn, if the conditions are true, then the value of T is added into the list.
An array comprehension takes the following form:
{T : E1 in D1, Cond1, …, En in Dn, Condn} |
It is the same as:
to_array([T : E1 in D1, Cond1, …, En in Dn, Condn]) |
The predicate put(Map,Key,V al) attaches the key-value pair Key=V al to the map Map, where Key is a non-variable term, and V al is any term. The function get(Map,Key) returns V al of the key-value pair Key=V al attached to Map. The predicate has_key(Map,Key) returns true iff Map contains a pair with the given key.
An attributed variable has a map attached to it. The predicate put_attr(X,Key,V al) attaches the key-value pair Key=V al to X. The function get_attr(X,Key) returns V al of the key-value pair Key=V al attached to X.
Picat also allows OOP notations for calling predicates and functions. The notation A1.f(A2,…,Ak) is the same as f(A1,A2,…,Ak), unless A1 is an atom, in which case A1 must be a module qualifier for f. The notation A.f, where f is an atom, is the same as the call A.f().
A predicate call either succeeds or fails, unless an exception occurs. A predicate call can return multiple answers through backtracking. The built-in predicate true always succeeds, and the built-in predicate fail (or false) always fails. A goal is made from predicate calls and statements, including conjunction (A, B), disjunction (A;B), negation (not A), if-then-else, foreach loops, and while loops.
A predicate is defined with pattern-matching rules. Picat has two types of pattern-matching rules: the non-backtrackable rule Head,Cond => Body, and the backtrackable rule Head,Cond ?=> Body. The Head takes the form p(t1,…,tn), where p is called the predicate name, and n is called the arity. When n = 0, the parentheses can be omitted. The condition Cond, which is an optional goal, specifies a condition under which the rule is applicable. For a call C, if C matches Head and Cond succeeds, meaning that the condition evaluates to true, the rule is said to be applicable to C. When applying a rule to call C, Picat rewrites C into Body. If the used rule is non-backtrackable, then the rewriting is a commitment, and the program can never backtrack to C. If the used rule is backtrackable, however, the program will backtrack to C once Body fails, meaning that Body will be rewritten back to C, and the next applicable rule will be tried on C.
A call matches the head fib(0,F) if the first argument is 0. The second argument can be anything. For example, for the call fib(0,2), the first rule is applied, since fib(0, 2) matches its head. However, when the body is executed, the call 2=1 fails.
The predicate fib/2 can also be defined using if-then-else as follows:
An if statement takes the form if Cond then Goal1 else Goal2 end.2 The then part can contain one or more elseif clauses. The else part can be omitted. In that case the else part is assumed to be else true. The built-in throw E throws term E as an exception.
The pattern [Y|_] matches any list. The backtrackable rule makes a call nondeterministic, and the predicate can be used to retrieve elements from a list one at a time through backtracking.
After Picat returns an answer, users can type a semicolon immediately after the answer to ask for the next answer. If users only want one answer to be returned from a call, they can use once Call to stop backtracking.
The version of member that checks if a term occurs in a list can be defined as follows:
The first rule is applicable to a call if the second argument is a list and the first argument of the call is identical to the first element of the list.
Picat supports Prolog-style Horn clauses.3 A Horn clause takes the form Head :- Body. When Body is true, the clause can be written as Head. Let Head be p(t1,…,tn). The Horn clause can be translated equivalently to the following pattern-matching rule:
p(V 1 , … , V n ) ?=> V 1 = $t1, …, V n = $tn, Body. |
where the dollar symbols indicate that all of the ti’s are terms, rather than function calls.
A predicate definition that consists of Horn clauses can be preceded by an index declaration in the form
index (M11 ,…,M1n) … (Mm1,…,Mmn) |
where each Mij is either + (meaning indexed) or - (meaning not indexed). When a predicate defined by Horn clauses is not preceded by an index declaration, Picat automatically generates one for the predicate. For each index pattern (Mi1,…,Min), the compiler generates a version of the predicate that indexes all of the + arguments. An index declaration only affects the efficiency, but not the behavior of the predicate. It is assumed that there is a default version of the predicate that has none of its arguments indexed.
For a predicate of Horn clauses, a matching version of the predicate is selected for a call. If no matching version is available, Picat uses the default version. For example, for the call edge(X,Y), if both X and Y are free, then the default version is used.
A function call always succeeds with a return value if no exception occurs. Functions are defined with non-backtrackable rules in which the head is an equation F=X, where F is the function pattern in the form f(t1 ,…,tn) and X holds the return value. When n = 0, the parentheses can be omitted.
A function call never fails and never succeeds more than once. For function calls such as fib(-1) or fib(X), Picat raises an exception.
Picat allows inclusion of function facts in the form f(t1,…,tn)=Exp in function definitions.
Function facts are automatically indexed on all of the input arguments, and hence no index declaration is necessary. Note that while a predicate call with no argument does not need parentheses, a function call with no argument must be followed with parentheses, unless the function is module-quantified, as in math.pi.
The fib function can also be defined as follows:
The conditional expression returns 1 if the condition (N=0;N=1) is true, and the value of
fib(N-1)+fib(N-2) if the condition is false.
Picat allows assignments in rule bodies. An assignment takes the form LHS:=RHS, where LHS is either a variable or an access of a compound value in the form X[…]. When LHS is an access in the form X[I], the component of X indexed I is updated. This update is undone if execution backtracks over this assignment.
In order to handle assignments, Picat creates new variables at compile time. In the above example, at compile time, Picat creates a new variable, say X1, to hold the value of X after the assignment X:=X+1. Picat replaces X by X1 on the LHS of the assignment. It also replaces all of the occurrences of X to the right of the assignment by X1. When encountering X1:=X1+2, Picat creates another new variable, say X2, to hold the value of X1 after the assignment, and replaces the remaining occurrences of X1 by X2. When write(X2) is executed, the value held in X2, which is 3, is printed. This means that the compiler rewrites the above example as follows:
Picat supports foreach and while statements for programming repetitions. A foreach statement takes the form
foreach (E1 in D1, Cond1, …, En in Dn, Condn) Goal end |
where each iterator, Ei in Di, can be followed by an optional condition Condi. Within each iterator, Ei is an iterating pattern, and Di is an expression that gives a compound value. The foreach statement means that Goal is executed for every possible combination of values E1 ∈ D1, …, En ∈ Dn that satisfies the conditions Cond1, …, Condn .4
A while statement takes the form
while (Cond) Goal end |
It repeatedly executes Goal as long as Cond succeeds. A variant of the while loop in the form of
do Goal while (Cond) |
executes Goal one time before testing Cond.
A loop statement forms a name scope. Variables that occur only in a loop, but do not occur before the loop in the outer scope, are local to each iteration of the loop. For example, in the following rule:
the variables I and E are local, and each iteration of the loop has its own values for these variables.
The function read_list reads a sequence of integers into a list, terminating when 0 is read. The loop corresponds to the following sequence of recurrences:
L=[] L1 =[e1 |L] L2=[e2|L1] … Ln=[en|Ln-1] List=Ln |
Note that the list of integers is in reversed order. If users want a list in the same order as the input, then the following loop can be used:
This loop corresponds to the following sequence of recurrences:
L=[e1 |L1 ] L1=[e2|L2] … Ln-1=[en|Ln] Ln=[] |
Loop statements are compiled into tail-recursive predicates. For example, the second read_list function given above is compiled into:
A list comprehension is first compiled into a foreach loop, and then the loop is compiled into a call to a generated tail-recursive predicate. For example, the list comprehension
is compiled into the following loop:
A predicate defines a relation where the set of facts is implicitly generated by the rules. The process of generating the facts may never end and/or may contain a lot of redundancy. Tabling can prevent infinite loops and redundancy by memorizing calls and their answers. In order to have all calls and answers of a predicate or function tabled, users just need to add the keyword table before the first rule.
When not tabled, the function call fib(N) takes exponential time in N. When tabled, however, it takes only linear time.
Users can also give table modes to instruct the system on what answers to table. Mode-directed tabling is especially useful for dynamic programming problems. In mode-directed tabling, a plus-sign (+) indicates input, a minus-sign (-) indicates output, max indicates that the corresponding variable should be maximized, min indicates that the corresponding variable should be minimized, and nt indicates that the corresponding argument is not tabled.5
For a call edit(L1,L2,D), where L1 and L2 are given lists and D is a variable, the rules can generate all facts, each of which contains a different editing distance between the two lists. The table mode table(+,+,min) tells the system to keep a fact with the minimal editing distance.
A tabled predicate can be preceded by both a table declaration and at most one index declaration if it contains facts. The order of these declarations is not important.
A module is a source file with the extension .pi. A module begins with a module name declaration and optional import declarations. A module declaration has the form:
module Name. |
where Name must be the same as the main file name. A file that does not begin with a module declaration is assumed to belong to the global module, and all of the predicates and functions that are defined in such a file are visible to all modules as well as the top-level of the interpreter.
An import declaration takes the form:
import Name1, …, Namen. |
where each Namei is a module name. When a module is imported, all of its public predicates and functions will be visible to the importing module. A public predicate or function in a module can also be accessed by preceding it with a module qualifier, as in m.p(), but the module still must be imported.
Atoms and structure names do not belong to any module, and are globally visible. In a module, predicates and functions are assumed to be visible both inside and outside of the module, unless their definitions are preceded by the keyword private.
The predicate sum_aux is private, and is never visible outside of the module. The following shows a session that uses these modules.
The command load(File) loads a module file into the system. If the file has not been compiled, then the load command compiles the file before loading it. If this module is dependent on other modules, then the other modules are loaded automatically if they are not yet in the system.6 When a module is loaded, all of its public predicates and functions become visible to the interpreter.
The Picat module system is static, meaning that the binding of normal calls to their definitions takes place at compile time. For higher-order calls, however, Picat may need to search for their definitions at runtime. Several built-in modules are imported by default, including basic, io, math, and sys. For a normal call that is not higher-order in a module, the Picat compiler searches modules for a definition in the following order:
Picat can be used as a modeling and solving language for constraint satisfaction and optimization problems. A constraint program normally poses a problem in three steps: (1) generate variables; (2) generate constraints over the variables; and (3) call solve to find a valuation for the variables that satisfies the constraints, and possibly optimizes an objective function. Picat provides four solver modules, including cp, sat, smt, and mip.
In arithmetic constraints, expressions are treated as data, and it is unnecessary to enclose them with dollar-signs.
The loops provided by Picat facilitate modeling of many constraint satisfaction and optimization problems. The following program solves a Sudoku puzzle:
Recall that variables that occur within a loop, and do not occur before the loop in the outer scope, are local to each iteration of the loop. For example, in the third foreach statement of the sudoku predicate, the variables Row, Col, and Square are local, and each iteration of the loop has its own values for these variables.
An exception is an event that occurs during the execution of a program that requires a special treatment. In Picat, an exception is just a term. Example exceptions thrown by the system include divide_by_zero, file_not_found, number_expected, interrupt, and out_of_range. The exception interrupt(keyboard) is raised when ctrl-c is typed during a program’s execution. The built-in predicate throw Exception throws Exception.
All exceptions, including those raised by built-ins and interruptions, can be caught by catchers. A catcher is a call in the form: catch(Goal,Exception,Handler) which is equivalent to Goal, except when an exception is raised during the execution of Goal that unifies Exception. When such an exception is raised, all of the bindings that have been performed on variables in Goal will be undone, and Handler will be executed to handle the exception.
The call call_cleanup(Goal,Cleanup) is equivalent to call(Call), except that Cleanup is called when Goal succeeds determinately (i.e., with no remaining choice point), when Goal fails, or when Goal raises an exception.
A predicate or function is said to be higher-order if it takes calls as arguments. The built-ins call, apply, and find_all are higher-order. The predicate call(S,Arg1,…,Argn), where S is an atom or a structure, calls the predicate named by S with the arguments that are specified in S together with extra arguments Arg1,…,Argn. The function apply(S,Arg1,…,Argn) is similar to call, except that apply returns a value. The function findall(Template,S) returns a list of all possible solutions of call(S) in the form of Template. Other higher-order predicates include ' +/1, call_cleanup/2, catch/3, count_all, freeze/2, maxof/2-3, maxof_inc/2-3, minof/2-3, minof_inc/2-3, not/1, once/1, time/1, time2/1, and time_out/3. All of these higher-order predicates are defined in the basic module, except for time/1, time2/1, and time_out/3, which are defined in the sys module. Higher-order calls cannot contain assignments or loops.
Among the higher-order built-ins, findall is special in that it forms a name scope like a loop. Local variables that occur in a findall call are not visible to subsequent calls in the body or query.
The meta-call apply never returns a partially evaluated function. If the number of arguments does not match the required number, then it throws an exception.
A call that is passed to apply is assumed to invoke a definition in a pre-imported built-in module, the enclosing module in which apply occurs, an imported module of the enclosing module, or the global module. Due to the overhead of runtime search, the use of higher-order calls is discouraged. Whenever possible, recursion, loops, or list and array comprehensions should be used instead.
Picat provides action rules for describing event-driven actors. An actor is a predicate call that can be delayed, and can be activated later by events. Each time an actor is activated, an action can be executed. A predicate for actors contains at least one action rule in the form:
Head, Cond, {Event}=> Body |
where Head is an actor pattern, Cond is an optional condition, Event is a non-empty set of event patterns separated by ’,’, and Body is an action. For an actor and an event, an action rule is said to be applicable if the actor matches Head and Cond is true. A predicate for actors cannot contain backtrackable rules.
An event channel is an attributed variable to which actors can be attached, and through which events can be posted to actors. A channel has four ports: ins, bound, dom, and any. An event pattern in Event specifies the port to which the actor is attached. The event pattern ins(X) attaches the actor to the ins-port of channel X, and the actor will be activated when X is instantiated. The event pattern event(X,T) attaches the actor to the dom-port of channel X. The built-in post_event(X,T) posts an event term T to the dom-port of channel X. After an event is posted to a port of a channel, the actors attached to that port are activated. For an activated actor, the system searches for an applicable rule and executes the rule body if it finds one. After execution, the actor is suspended, waiting to be activated again by other events. Picat does not provide a built-in for detaching actors from channels. An actor fails if no rule is applicable to it when it is activated or the body of the applied rule fails. An actor becomes a normal call once a normal non-backtrackable rule is applied to it.
When a call echo(X,Flag) is executed, where Flag is a variable, it is attached to the dom-port of X as an actor. The actor is then suspended, waiting for events posted to the dom-port. For this actor definition, the command
prints out hello followed by picat. If the call foo(Flag) is inserted before the second call to post_event, then var(Flag) fails when the actor is activated the second time, causing the second rule to be applied to the actor. Then, the output will be hello followed by done. Note that events are not handled until a non-inline call is executed. Replacing foo(Flag) by Flag=1 will result in a different behavior because Flag=1 is an inline call.
Picat has three kinds of prebuilt maps: heap maps, global maps, and table maps. Prebuilt heap maps are created on the heap immediately after the system is started. The built-in function get_heap_map(ID) returns the heap map that is associated with ID, where ID must be a ground term. If no heap map is associated with ID, then this function establishes an association between ID and an unused heap map, and returns the map. A heap map is like a normal map. Users use put to add key-value pairs into the map. Users use get to retrieve a value that is associated with a key in the map. Changes to a heap map up to a choice point are undone when execution backtracks to that choice point. The built-in function get_heap_map() returns the heap map that is associated with a system-generated default identifier. There are an unlimited number of prebuilt heap maps.
Global maps are created in the global area when the Picat system is started. The built-in function get_global_map(ID) returns the global map that is associated with ID, where ID must be a ground term. If no global map is associated with ID, then this function establishes an association between ID and an unused global map, and returns the map. A big difference between a global map and a heap map is that changes to the global map are not undone upon backtracking. When a key-value pair is added into the global map, the variables in the value term are numbered before they are copied to the global area. If the value term contains attributed variables, then the attributes of the variables are not copied, and are therefore lost. When retrieving a value that is associated with a key, the value term in the global area is copied back to the heap after all of the numbered variables are unnumbered. The built-in function get_global_map() returns the global map that is associated with a system-generated default identifier. The number of prebuilt global maps is 97, and the system halts if a program requests more than 97 global maps.
Table maps are created in the table area when the Picat system is started. The built-in function get_table_map(ID) returns the table map that is associated with ID, where ID must be a ground term. If no table map is associated with ID, then this function establishes an association between ID and an unused table map, and returns the map. Like the global map, changes to a table map are not undone upon backtracking. Unlike the global map, however, keys and values are hash-consed so that common ground sub-terms are not replicated in the table area. The built-in function get_table_map() returns the table map that is associated with a system-generated default identifier. The number of prebuilt table maps is 97, and the system halts if a program requests more than 97 table maps.
The advantage of using prebuilt maps is that data can be accessed everywhere without being passed as arguments, and the disadvantage is that it affects locality of data and thus the readability of programs. In tabled programs, using prebuilt maps is discouraged because it may cause unanticipated effects.
For the call go, the output is:
The fail call in the first rule causes execution to backtrack to the second rule. After backtracking, the pair added to the heap map by the first rule is lost, but the pair added to the global map and the pair added to the table map remain.
Project Euler (projecteuler.net) is an excellent platform for practicing programming and problem solving skills. You can find Picat solutions to some of the problems at picat-lang.org/projects.html. Select five problems from the Project Euler problem set for which no solutions have been posted, and write a program in Picat for each of them.
The Picat system is written in both C and Picat. The Picat interpreter is provided as a single standalone executable file, named picat.exe for Windows and picat for Unix. The Picat interpreter provides an interactive programming environment for users to compile, load, debug, and execute programs. In order to start the Picat interpreter, users first need to open an OS terminal. In Windows, this can be done by selecting Start->Run and typing cmd or selecting Start->Programs->Accessories->Command Prompt. In order to start the Picat interpreter in any working directory, the environment variable path must be properly set to contain the directory where the executable is located.
The Picat interpreter is started with the OS command picat.
OSPrompt picat |
where OSPrompt is the OS prompt. After the interpreter is started, it responds with the prompt Picat>, and is ready to accept queries.
In general, the picat command takes the following form:
picat Opts PicatMainFileName A1 A2 … An |
where PicatMainFileName can have the extension .pi and can contain a path of directories, and Opts is a sequence of options of the following:
Once the interpreter is started, users can type a query after the prompt. For example,
The halt predicate, or the exit predicate, terminates the Picat interpreter. An alternative way to terminate the interpreter is to enter ctrl-d (control-d) when the cursor is located at the beginning of an empty line.
The Picat interpreter uses the getline program written by Chris Thewalt. The getline program memorizes up to 100 of the most recent queries that the users have typed, and allows users to recall past queries and edit the current query by using Emacs editing commands. The following gives the editing commands:
ctrl-f | Move the cursor one position forward. |
ctrl-b | Move the cursor one position backward. |
ctrl-a | Move the cursor to the beginning of the line. |
ctrl-e | Move the cursor to the end of the line. |
ctrl-d | Delete the character under the cursor. |
ctrl-h | Delete the character to the left of the cursor. |
ctrl-k | Delete the characters to the right of the cursor. |
ctrl-u | Delete the whole line. |
ctrl-p | Load the previous query in the buffer. |
ctrl-n | Load the next query in the buffer. |
Note that the command ctrl-d terminates the interpreter if the line is empty and the cursor is located in the beginning of the line.
A Picat program is stored in one or more text files with the extension name pi. A file name is a string of characters. Picat treats both ’/’ and ’' ’ as file name separators. Nevertheless, since ’' ’ is used as the escape character in quoted strings, two consecutive backslashes must be used, as in "c:' ' work' ' myfile.pi", if ’' ’ is used as the separator.
For the sake of demonstration we assume the existence of a file named welcome.pi in the current working directory that stores the following program:
After a program is loaded, users can query the program. For each query, the system executes the program, and reports yes when the query succeeds and no when the query fails. When a query that contains variables succeeds, the system also reports the bindings for the variables. For example,
Users can ask the system to find the next solution by typing ’;’ after a solution if the query has multiple solutions. For example,
Users can force a program to terminate by typing ctrl-c, or by letting it execute the built-in predicate abort. Note that when the system is engaged in certain tasks, such as garbage collection, users may need to wait for a while in order to see the termination after they type ctrl-c.
Programs that define the main/0 predicate or the main/1 predicate can be run directly as a OS command. For example,
The ’$’ sign is the prompt of the OS. It is assumed that the environment variable PATH has been set to contain the directory of the executable picat (picat.exe for Windows), and the environment variable PICATPATH has been set to contain the directory of the welcome.pi file or the file is in the current working directory.
It is possible to create a script that can be run as a standalone executable. For example, consider the following script welcome.exe for Linux:
Once the environment variables PATH and PICATPATH are set properly, and the script is set to have the execution permission, it can be executed as follows:
The Picat system has three execution modes: non-trace mode, trace mode, and spy mode. In trace mode, it is possible to trace the execution of a program, showing every call in every possible stage. In order to trace the execution, the program must be recompiled while the system is in trace mode. In spy mode, it is possible to trace the execution of individual functions and predicates that are spy points. When the Picat interpreter is started, it runs in non-trace mode. The predicate debug or trace changes the mode to trace. The predicate nodebug or notrace changes the mode to non-trace.
In trace mode, the debugger displays execution traces of queries. An execution trace consists of a sequence of call traces. Each call trace is a line that consists of a stage, the number of the call, and the information about the call itself. For a function call, there are two possible stages: Call, meaning the time at which the function is entered, and Exit, meaning the time at which the call is completed with an answer. For a predicate call, there are two additional possible stages: Redo, meaning a time at which execution backtracks to the call, and Fail, meaning the time at which the call is completed with a failure. The information about a call includes the name of the call, and the arguments. If the call is a function, then the call is followed by = and ? at the Call stage, and followed by = V alue at the Exit stage, where V alue is the return value of the call.
Consider, for example, the following program:
Assume the program is stored in a file named myprog.pi. The following shows a trace for a query:
In trace mode, the debugger displays every call in every possible stage. Users can set spy points so that the debugger only shows information about calls of the symbols that users are spying. Users can use the predicate
spy $Name/N |
to set the functor Name/N as a spy point, where the arity N is optional. If the functor is defined in multiple loaded modules, then all these definitions will be treated as spy points. If no arity is given, then any functor of Name is treated as a spy point, regardless of the arity.
After displaying a call trace, if the trace is for stage Call or stage Redo, then the debugger waits for a command from the users. A command is either a single letter followed by a carriage-return, or just a carriage-return. See Appendix B for the debugging commands.
Picat is a dynamically-typed language, in which type checking occurs at runtime. A variable gets a type once it is bound to a value. In Picat, variables and values are terms. A value can be primitive or compound. A primitive value can be an integer, a real number, or an atom. A compound value can be a list or a structure. Strings, arrays, maps, sets, and heaps are special compound values. This chapter describes the data types and the built-ins for each data type that are provided by the basic module.
Many of the built-ins are given as operators. Table 3.1 shows all of the operators that are provided by Picat. Unless the table specifies otherwise, the operators are left-associative. The as-pattern operator (@) and the operators for composing goals, including not, once, conjunction (, and &&), and disjunction (; and ||), will be described in Chapter 4 on Predicates and Functions. The constraint operators (the ones that begin with #) will be described in Chapter 12 on Constraints. In Picat, no new operators can be defined, and none of the existing operators can be redefined.
The dot operator (.) is used in OOP notations for calling predicates and functions. It is also used to qualify calls with a module name. The notation A1.f(A2,…,Ak) is the same as f(A1 , A2 , … , Ak ), unless A1 is an atom, in which case A1 must be a module qualifier for f. If an atom needs to be passed as the first argument to a function or a predicate, then this notation cannot be used. The notation A.Attr, where Attr does not have the form f(…), is the same as the function call get(A,Attr). For example, the expression S.name returns the name, and the expression S.arity returns the arity of S if S is a structure. Note that the dot operator is left-associative. For example, the expression X.f().g() is the same as g(f(X)).
Precedence | Operators |
Highest | ., @ |
** (right-associative) | |
unary +, unary -, ~ | |
*, /, //, /<, />, div, mod, rem | |
binary +, binary - | |
>>, << | |
/' | |
^ | |
' / | |
.. | |
++ (right-associative) | |
=, !=, :=, ==, !==, =:=, <, =<, <=, >, >=, ::, in, notin, =.. | |
#=, #!=, #<, #=<, #<=, #>, #>=, @<, @=<, @<=, @>, @>= | |
#~ | |
#/' | |
#^ | |
#' / | |
#=> (right-associative) | |
#<=> | |
not, once, ' + | |
, (right-associative), && (right-associative) | |
Lowest | ; (right-associative), || (right-associative) |
The following functions are provided for all terms:
Other built-ins on terms are given in Sections 3.5 and 3.8.
Variables in Picat, like variables in mathematics, are value holders. Unlike variables in imperative languages, Picat variables are not symbolic addresses of memory locations. A variable is said to be free if it does not hold any value. A variable is instantiated when it is bound to a value. Picat variables are single-assignment, which means that after a variable is instantiated to a value, the variable will have the same identity as the value. After execution backtracks over a point where a binding took place, the value that was assigned to a variable will be dropped, and the variable will be turned back into a free variable.
A variable name is an identifier that begins with a capital letter or the underscore. For example, the following are valid variable names:
The name _ is used for anonymous variables. In a program, different occurrences of _ are treated as different variables. So the test _ == _ is always false.
The following two built-ins are provided to test whether a term is a free variable:
An attributed variable is a variable that has a map of attribute-value pairs attached to it. The following built-ins are provided for attributed variables:
An atom is a symbolic constant. An atom name can either be quoted or unquoted. An unquoted name is an identifier that begins with a lower-case letter, followed by an optional string of letters, digits, and underscores. A quoted name is a single-quoted sequence of arbitrary characters. A character can be represented as a single-character atom. For example, the following are valid atom names:
No atom name can last more than one line. An atom name cannot contain more than 1000 characters. The backslash character ’' ’ is used as the escape character. So, the name ’a' ’b' n’ contains four characters: a, ’, b, and ' n.
The following built-ins are provided for atoms:
A number can be an integer or a real number. An integer can be a decimal numeral, a binary numeral, an octal numeral, or a hexadecimal numeral. In a numeral, digits can be separated by underscores, but underscore separators are ignored by the tokenizer. For example, the following are valid integers:
12_345 a decimal numeral 0b100 4 in binary notation 0o73 59 in octal notation 0xf7 247 in hexadecimal notation |
A real number consists of an optional integer part, an optional decimal fraction preceded by a decimal point, and an optional exponent. If an integer part exists, then it must be followed by either a fraction or an exponent in order to distinguish the real number from an integer literal. For example, the following are valid real numbers.
Table 3.2 gives the meaning of each of the numeric operators in Picat, from the operator with the highest precedence (**) to the one with the lowest precedence (..). Except for the power operator **, which is right-associative, all of the arithmetic operators are left-associative.
X ** Y | power |
+X | same as X |
-X | sign reversal |
~X | bitwise complement |
X * Y | multiplication |
X / Y | division |
X // Y | integer division, truncated |
X /> Y | integer division (ceiling(X / Y )) |
X /< Y | integer division (floor(X / Y )) |
X div Y | integer division, floored |
X mod Y | modulo, same as X - floor(X div Y ) * Y |
X rem Y | remainder (X - (X // Y ) * Y ) |
X + Y | addition |
X - Y | subtraction |
X » Y | right shift |
X « Y | left shift |
X /' Y | bitwise and |
X ^ Y | bitwise xor |
X ' / Y | bitwise or |
From .. Step .. To | A range (list) of numbers with a step |
From .. To | A range (list) of numbers with step 1 |
X =:= Y | pretty much (numerically) equal |
In addition to the numeric operators, the basic module also provides the following built-ins for numbers:
The math module provides more numeric functions. See Appendix A.
A compound term can be a list or a structure. Components of compound terms can be accessed with subscripts. Let X be a variable that references a compound value, and let I be an integer expression that represents a subscript. The index notation X[I] is a special function that returns the Ith component of X if I is an integer or a list of components if I is a range in the form of l..u, counting from the beginning. Subscripts begin at 1, meaning that X[1] is the first component of X. An index notation can take multiple subscripts. For example, the expression X[1,2] is the same as T[2], where T is a temporary variable that references the component that is returned by X[1]. The predicate compound(Term) is true if Term is a compound term.
A list takes the form [t1,…,tn], where each ti (1 ≤ i ≤ n) is a term. Let L be a list. The expression L.length, which is the same as the functions get(L,length) and length(L), returns the length of L. Note that a list is represented internally as a singly-linked list. Also note that the length of a list is not stored in memory; instead, it is recomputed each time that the function length is called.
The symbol ’|’ is not an operator, but a separator that separates the first element (so-called car) from the rest of the list (so-called cdr). The cons notation [H|T] can occur in a pattern or in an expression. When it occurs in a pattern, it matches any list in which H matches the car and T matches the cdr. When it occurs in an expression, it builds a list from H and T . The notation [A1 ,A2 ,… ,An |T] is a shorthand for [A1|[A2|…[An|T]…]. So [a,b,c] is the same as [a|[b|[c|[]]]].
The basic module provides the following built-ins on lists, most of which are overloaded for strings (3.4.2) and arrays (see 3.4.4).
sort(List).remove_dups() |
sort_remove_dups(List,KeyIndex) = SList: This function is the same as the following, but is faster.
sort(List,KeyIndex).remove_dups() |
sort_down(List) = SList: This function sorts the elements of List in descending order, returning the result in SList. sort_down(List,KeyIndex) = SList: This function sorts the elements of List by the key index KeyIndex in descending order, returning the result in SList. sort_down_remove_dups(List) = SList: This function is the same as the following, but is faster.
sort_down(List).remove_dups() |
sort_down_remove_dups(List,KeyIndex) = SList: This function is the same as the following, but is faster.
sort_down(List,KeyIndex).remove_dups() |
slice(List,From,To) = SList: This function returns the sliced list of List from index From through index To. From must not be less than 1. It is the same as the index notation List[From..To]. slice(List,From) = SList: This function is the same as the following.
slice(List,From,List.length) |
sum(List) = V al: This function returns the sum of all of the values in List. tail(List) = Term: This function returns the tail of the list List. For example, the call tail([1,2,3]) returns [2,3]. to_array(List) = Array: This function converts the list List to an array. The elements of the array are in the same order as the elements of the list. zip(List1 ,List2,…,Listn) = List: This function makes a list of array tuples. The jth tuple in the list takes the form {E1j,…,Enj}, where Eij is the jth element in Listi. In the current implementation, n can be 2, 3, or 4.
A string is represented as a list of single-character atoms. For example, the string "hello" is the same as the list [h,e,l,l,o]. In addition to the built-ins on lists, the following built-ins are provided for strings:
A structure takes the form $s(t1,…,tn), where s is an atom, and n is called the arity of the structure. The dollar symbol is used to distinguish a structure from a function call. The functor of a structure comprises the name and the arity of the structure.
The following types of structures can never denote functions, meaning that they do not need to be preceded by a $ symbol.
Goals: (a,b), (a;b), not a, X = Y, X != 100, X > 1 Constraints: X+Y #= 100, X #!= 1 Arrays: {2,3,4}, {P1,P2,P3} |
Picat disallows creation of the following types of structures:
Dot notations: math.pi, my_module.f(a) Index notations: X[1]+2 X[Y[I]] Assignments: X:=Y+Z, X:=X+1 Ranges: 1..10, 1..2..10 List comprehensions: [X : X in 1..5] Array comprehensions: {X : X in 1..5} If-then: if X>Y then Z=X else Z=Y end Loops: foreach (X in L) writeln(X) end |
The compiler will report a syntax error when it encounters any of these expressions within a term constructor.
The following built-ins are provided for structures:
An array takes the form {t1,…,tn}, which is a special structure with the name ’{}’ and arity n. Note that, unlike a list, an array always has its length stored in memory, so the function length(Array) always takes constant time. Also note that Picat supports constant-time access of array elements, so the index notation A[I] takes constant time when I is an integer.
In addition to the built-ins for structures, the following built-ins are provided for arrays:
The following built-ins, which are originally provided for lists (see 3.4.1), are overloaded for arrays:
Note that many of the overloaded built-ins for arrays are not implemented efficiently, but are provided for convenience. For example, sort(Array) is implemented as follows:
A map is a hash-table that is represented as a structure that contains a set of key-value pairs. The functor of the structure that is used for a map is not important. An implementation may ban access to the name and the arity of the structure of a map. Maps must be created with the built-in function new_map, unless they are prebuilt (see Section 1.11). In addition to the built-ins for structures, the following built-ins are provided for maps:
Most of the built-ins are overloaded for attributed variables.
A set is a map where every key is associated with the atom not_a_value. All of the built-ins for maps can be applied to sets. For example, the built-in predicate has_key(Set,Elm) tests if Elm is in Set. In addition to the built-ins on maps, the following built-ins are provided for sets:
A heap2 is a complete binary tree represented as an array. A heap can be a min-heap or a max-heap. In a min-heap, the value at the root of each subtree is the minimum among all the values in the subtree. In a max-heap, the value at the root of each subtree is the maximum among all the values in the subtree.
The equality test T1 == T2 is true if term T1 and term T2 are identical. Two variables are identical if they are aliases. Two primitive values are identical if they have the same type and the same internal representation. Two lists are identical if the cars are identical and the cdrs are identical. Two structures are identical if their functors are the same and their components are pairwise identical. The inequality test T1 !== T2 is the same as not T1 == T2. Note that two terms can be identical even if they are stored in different memory locations. Also note that it takes linear time in the worst case to test whether two terms are identical, unlike in C-family languages, in which the equality test operator == only compares addresses.
The unification T1 = T2 is true if term T1 and term T2 are already identical, or if they can be made identical by instantiating the variables in the terms. The built-in T1 != T2 is true if term T1 and term T2 are not unifiable. The predicate bind_vars(Term,V al) binds all of the variables in Term to V al.
The last query illustrates the occurs-check problem. When binding X to f(X), Picat does not check if X occurs in f(X) for the sake of efficiency. This unification creates a cyclic term, which can never be printed.
When a unification’s operands contain attributed variables, the implementation is more complex. When a plain variable is unified with an attributed variable, the plain variable is bound to the attributed variable. When two attributed variables, say Y and O, where Y is younger than O, are unified, Y is bound to O, but Y ’s attributes are not copied to O. Since garbage collection does not preserve the seniority of terms, the result of the unification of two attributed variables is normally unpredictable.
The numerical equality test T1 =:= T2 is true if term T1 and term T2 are pretty much the same numerical value. This means that T1 and T2 must both be numbers after evaluation. Whereas the test T1 == T2 fails if one number is an integer and one number is a real number, the test T1 =:= T2 may succeed if the difference between T1 and T2 is less than 0.00000001. Consider the following examples.
In the first query, 1 is an integer, while 1.0 is a real number, so the equality test fails. However, the second query, which is a numerical equality test, succeeds.
Picat orders terms in the following way:
var < number < atom < structure and array < list and string |
Variables are ordered by their addresses. Note that the ordering of variables may change after garbage collection. Numbers are ordered by their numerical values. Atoms are ordered lexicographically. Structures are first ordered lexicographically by their names; if their names are the same, then they are ordered by their components. Arrays are ordered as structures with the special name ’{}’. Lists and strings are ordered by their elements.
Expressions are made from variables, values, operators, and function calls. Expressions differ from terms in the following ways:
A conditional expression, which takes the form cond(Cond,Exp1,Exp2), is a special kind of function call that returns the value of Exp1 if the condition Cond is true and the value of Exp2 if Cond is false.
Note that, except for conditional expressions in which the conditions are made of predicates, no expressions can contain predicates. A predicate is true or false, but never returns any value.
A predicate or function is said to be higher-order if it takes calls as arguments. The basic module has the following higher-order predicates and functions.
In Picat, predicates and functions are defined with rules. Each rule is terminated by a dot (.) followed by a white space or the newline character.
Picat has two types of pattern-matching rules: the non-backtrackable rule
Head, Cond => Body. |
Head, Cond ?=> Body. |
Picat also supports Prolog-style Horn clauses and Definite Clause Grammar (DCG) rules for predicate definitions. A Horn clause takes the form:
Head :- Body. |
which can be written as Head if Body is true. A DCG rule takes the form:
Head --> Body. |
A predicate definition that consists of Horn clauses can be preceded by an index declaration, as described in Section 1.2. Picat converts Horn clauses and DCG rules into pattern-matching rules.
A predicate defines a relation, and can have zero, one, or multiple answers. Within a predicate, the Head is a pattern in the form p(t1,…,tn), where p is called the predicate name, and n is called the arity. When n = 0, the parentheses can be omitted. The condition Cond, which is an optional goal, specifies a condition under which the rule is applicable. Cond cannot succeed more than once. The compiler converts Cond to once Cond if would otherwise be possible for Cond to succeed more than once.
For a call C, if C matches the pattern p(t1,…,tn) and Cond is true, then the rule is said to be applicable to C. When applying a rule to call C, Picat rewrites C into Body. If the used rule is non-backtrackable, then the rewriting is a commitment, and the program can never backtrack to C. However, if the used rule is backtrackable, then the program will backtrack to C once Body fails, meaning that Body will be rewritten back to C, and the next applicable rule will be tried on C.
A predicate is said to be deterministic if it is defined with non-backtrackable rules only, non-deterministic if at least one of its rules is backtrackable, and globally deterministic if it is deterministic and all of the predicates in the bodies of the predicate’s rules are also globally deterministic. A deterministic predicate that is not globally deterministic can still have more than one answer.
The predicate append(Xs,Ys,Zs) is true if the concatenation of Xs and Ys is Zs. It defines a relation among the three arguments, and does not assume directionality of any of the arguments. For example, this predicate can be used to concatenate two lists, as in the call
this predicate can also be used to split a list nondeterministically into two sublists, as in the call append(L1,L2,[a,b,c,d]); this predicate can even be called with three free variables, as in the call append(L1,L2,L3).
The predicate min_max(L,Min,Max) returns two answers through its arguments. It binds Min to the minimum of list L, and binds Max to the maximum of list L. This predicate does not backtrack. Note that a call fails if the first argument is not a list. Also note that this predicate consumes linear space. A tail-recursive version of this predicate that consumes constant space will be given below.
A function is a special kind of a predicate that always succeeds with one answer. Within a function, the Head is an equation p(t1,…,tn)=X, where p is called the function name, and X is an expression that gives the return value. Functions are defined with non-backtrackable rules only.
For a call C, if C matches the pattern p(t1,…,tn) and Cond is true, then the rule is said to be applicable to C. When applying a rule to call C, Picat rewrites the equation C=X′ into (Body, X′=X), where X′ is a newly introduced variable that holds the return value of C.
Picat allows inclusion of function facts in the form p(t1,…,tn)=Exp in function definitions. The function fact p(t1,…,tn)=Exp is shorthand for the rule:
p(t1 ,… ,tn )=X => X=Exp. |
where X is a new variable.
Although all functions can be defined as predicates, it is preferable to define them as functions for two reasons. Firstly, functions often lead to more compact expressions than predicates, because arguments of function calls can be other function calls. Secondly, functions are easier to debug than predicates, because functions never fail and never return more than one answer.
The function qequation(A,B,C) returns the pair of roots of A*X2+B*X+C=0. If the discriminant B*B-4*A*C is negative, then an exception will be thrown.
The function rev(L) returns the reversed list of L. Note that the function rev(L) takes quadratic time and space in the length of L. A tail-recursive version that consumes linear time and space will be given below.
The pattern p(t1 ,…,tn) in the head of a rule takes the same form as a structure. Function calls are not allowed in patterns. Also, patterns cannot contain index notations, dot notations, ranges, array comprehensions, or list comprehensions. Pattern matching is used to decide whether a rule is applicable to a call. For a pattern P and a term T , term T matches pattern P if P is identical to T , or if P can be made identical to T by instantiating P ’s variables. Note that variables in the term do not get instantiated after the pattern matching. If term T is more general than pattern P , then the pattern matching can never succeed.
Unlike calls in many committed-choice languages, calls in Picat are never suspended if they are more general than the head patterns of the rules. A predicate call fails if it does not match the head pattern of any of the rules in the predicate. A function call throws an exception if it does not match the head pattern of any of the rules in the function. For example, for the function call rev(L), where L is a variable, Picat will throw the following exception:
unresolved_function_call(rev(L)). |
A pattern can contain as-patterns in the form V @Pattern, where V is a new variable in the rule, and Pattern is a non-variable term. The as-pattern V @Pattern is the same as Pattern in pattern matching, but after pattern matching succeeds, V is made to reference the term that matched Pattern. As-patterns can avoid re-constructing existing terms.
In the third rule, the as-pattern Ys@[Y|_] binds two variables: Ys references the second argument, and Y references the car of the argument. The rule can be rewritten as follows without using any as-pattern:
Nevertheless, this version is less efficient, because the cons [Y|Ys] needs to be re-constructed.
In a rule, both the condition and the body are goals. Queries that the users give to the interpreter are also goals. A goal can take one of the following forms:
The repeat predicate is often used to describe failure-driven loops. For example, the query
repeatedly outputs ’a’ until ctrl-c is typed.
if Cond1 then Goal1 elseif Cond2 then Goal2 elseif Condn then Goaln else Goalelse end |
where the elseif and else clauses are optional. If the else clause is missing, then the else goal is assumed to be true. For the if-then statement, Picat finds the first condition Condi that is true. If such a condition is found, then the truth value of the if-then statement is the same as Goali. If none of the conditions is true, then the truth value of the if-then statement is the same as Goalelse. Note that no condition can succeed more than once. throw Exception: This predicate throws the term Exception. This predicate will be detailed in Chapter 6 on Exceptions. !: This special predicate, called a cut, is provided for controlling backtracking. A cut in the body of a rule has the effect of removing the choice points, or alternative rules, of the goals to the left of the cut. Loops: Picat has three types of loop statements: foreach, while, and do-while. A loop statement is true if and only if every iteration of the loop is true. The details of loops are given in Chapter 5.
A rule is said to be tail-recursive if the last call of the body is the same predicate as the head. The last-call optimization enables last calls to reuse the stack frame of the head predicate if the frame is not protected by any choice points. This optimization is especially effective for tail recursion, because it converts recursion into iteration. Tail recursion runs faster and consumes less memory than non-tail recursion.
The trick to convert a predicate (or a function) into tail recursion is to define a helper that uses an accumulator parameter to accumulate the result. When the base case is reached, the accumulator is returned. At each iteration, the accumulator is updated. Initially, the original predicate (or function) calls the helper with an initial value for the accumulator parameter.
In the helper predicate min_max_helper(L,CMin,Min,CMax,Max), CMin and CMax are accumulators: CMin is the current minimum value, and CMax is the current maximum value. When L is empty, the accumulators are returned by the unification calls Min=CMin and Max=CMax. When L is a cons [H|T], the accumulators are updated: CMin changes to min(CMin,H), and CMax changes to max(CMax,H). The helper function rev_helper(L,R) follows the same idea: it uses an accumulator list to hold, in reverse order, the elements that have been scanned. When L is empty, the accumulator is returned. When L is the cons [X|Xs], the accumulator R changes to [X|R].
This chapter discusses variable assignments, loop constructs, and list and array comprehensions in Picat. It describes the scope of an assigned variable, indicating where the variable is defined, and where it is not defined. Finally, it shows how assignments, loops, and list comprehensions are related, and how they are compiled.
Picat variables are single-assignment, meaning that once a variable is bound to a value, the variable cannot be bound again. In order to simulate imperative language variables, Picat provides the assignment operator :=. An assignment takes the form LHS:=RHS, where LHS is either a variable or an access of a compound value in the form X[…]. When LHS is an access in the form X[I], the component of X indexed I is updated. This update is undone if execution backtracks over this assignment.
The compiler needs to give special consideration to the scope of a variable. The scope of a variable refers to the parts of a program where a variable occurs.
Consider the test example. This example binds X to 0. Then, the example tries to bind X to X + 1. However, X is still in scope, meaning that X is already bound to 0. Since X cannot be bound again, the compiler must perform extra operations in order to manage assignments that use the := operator.
In order to handle assignments, Picat creates new variables at compile time. In the test example, at compile time, Picat creates a new variable, say X1, to hold the value of X after the assignment X := X + 1. Picat replaces X by X1 on the LHS of the assignment. All occurrences of X after the assignment are replaced by X1. When encountering X1 := X1 + 2, Picat creates another new variable, say X2, to hold the value of X1 after the assignment, and replaces the remaining occurrences of X1 by X2. When write(X2) is executed, the value held in X2, which is 3, is printed. This means that the compiler rewrites the above example as follows:
This leads to the question: what does the compiler do if the code branches? Consider the following code skeleton.
The if_ex example performs exactly one assignment. At compilation time, the compiler does not know whether or not Z>0 evaluates to true. Therefore, the compiler does not know whether to introduce a new variable for X or for Y.
Therefore, when an if-else statement contains an assignment, the compiler rewrites the if-else statement as a predicate. For example, the compiler rewrites the above example as follows:
One rule is generated for each branch of the if-else statement. For each variable V that occurs on the LHS of an assignment statement that is inside of the if-else statement, predicate p is passed two arguments, Vin and Vout. In the above example, X and Y each occur on the LHS of an assignment statement. Therefore, predicate p is passed the parameters Xin, Xout, Yin, and Yout.
Picat has three types of loop statements for programming repetitions: foreach, while, and do-while.
foreach (E1 in D1, Cond1, …, En in Dn, Condn) Goal end |
Each Ei is an iterating pattern. Each Di is an expression that gives a compound value. Each Condi is an optional condition on iterators E1 through Ei.
Foreach loops can be used to iterate through compound values, as in the following examples.
The loop_ex1 example iterates through a list. The loop_ex2 example iterates through a map, where Key=Value is the iterating pattern.
The loop_ex1 example can also be written, using a failure-driven loop, as follows.
Recall that the range Start..Step..End stands for a list of numbers. Ranges can be used as compound values in iterators.
Also recall that the function zip(List1,List2,…,Listn) returns a list of tuples. This function can be used to simultaneously iterate over multiple lists.
Each of the previous examples uses a single iterator. Foreach loops can also contain multiple iterators.
If a foreach loop has multiple iterators, then it is compiled into a series of nested foreach loops in which each nested loop has a single iterator. In other words, a foreach loop with multiple iterators executes its goal once for every possible combination of values in the iterators.
The foreach loop in loop_ex4 is the same as the nested loop:
When the condition break(Cond) occurs to the right of an iterator in a foreach loop, it terminates the iterator if Cond is true. Note that the loop body is not executed when Cond is true.
The foreach loop in loop_ex6 prints out the elements of the array A, stopping when it encounters a 0.
while (Cond) Goal end |
As long as Cond succeeds, the loop will repeatedly execute Goal.
The while loop in loop_ex7 prints all of the odd numbers between 1 and 9. It is similar to the foreach loop
The while loop in loop_ex8 never executes its goal. J begins at 6, so the condition J <= 5 is never true, meaning that the body of the loop does not execute.
The while loop in loop_ex9 demonstrates a compound condition. The loop executes as long as the value that is read into E is either a multiple of 2 or a multiple of 5.
The while loop in loop_ex10 also demonstrates a compound condition. Unlike in loop_ex9, in which either condition must be true, in loop_ex10, both conditions must be true. The loop executes as long as the value that is read into E is both a multiple of 2 and a multiple of 5.
do Goal while (Cond) |
A do-while loop is similar to a while loop, except that a do-while loop executes Goal one time before testing Cond. The following example demonstrates the similarities and differences between do-while loops and while loops.
Unlike loop_ex8, loop_ex11 executes its body once. Although J begins at 6, the do-while loop prints J, and increments J before evaluating the condition J <= 5.
A list comprehension is a special functional notation for creating lists. List comprehensions have a similar format to foreach loops.
[T : E1 in D1, Cond1, …, En in Dn, Condn] |
T is an expression. Each Ei is an iterating pattern. Each Di is an expression that gives a compound value. Each Condi is an optional condition on iterators E1 through Ei.
An array comprehension takes the following form:
{T : E1 in D1, Cond1, …, En in Dn, Condn} |
It is the same as:
to_array([T : E1 in D1, Cond1, …, En in Dn, Condn]) |
Variables that occur in a loop, but do not occur before the loop in the outer scope, are local to each iteration of the loop. For example, in the rule
the variables I and E are local, and each iteration of the loop has its own values for these variables.
Consider the example:
In this example, the while loop contains an assignment statement. As mentioned above, at compilation time, Picat creates new variables in order to handle assignments. One new variable is created for each assignment. However, when this example is compiled, the compiler does not know the number of times that the body of the while loop can be executed. This means that the compiler does not know how many times the assignment I := I + 1 will occur, and the compiler is unable to create new variables for this assignment. In order to solve this problem, the compiler compiles while loops into tail-recursive predicates.
In the while_test example, the while loop is compiled into:
Note that the first rule of the predicate p(I, N) has the same condition as the while loop. The second rule, which has no condition, terminates the while loop, because the second rule is only executed if I > N. The call p(I1, N) is the tail-recursive call, with I1 storing the modified value.
Suppose that a while loop modifies a variable that is then used outside of the while loop. For each modified variable V that is used after the while loop, predicate p is passed two arguments, Vin and Vout. Then, a predicate that has the body true is not sufficient to terminate the compiled while loop.
The next example demonstrates a loop that has multiple accumulators, and that modifies values which are then used outside of the loop.
This loop finds the minimum and maximum values of a list. The loop is compiled to:
Notice that there are multiple accumulators: MinIn and MaxIn. Since the min_max predicate returns two values, the accumulators each have an in variable (MinIn and Maxin) and an out variable (MinOut and MaxOut). If the first parameter of predicate p is an empty list, then MinOut is set to the value of MinIn, and MaxOut is set to the value of MaxIn.
Foreach and do-while loops are compiled in a similar manner to while loops.
As mentioned above, variables that only occur within a loop are local to each iteration of the loop. In nested loops, variables that are local to the outer loop are global to the inner loop. In other words, if a variable occurs in the outer loop, then the variable is also visible in the inner loop. However, variables that are local to the inner loop are not visiable to the outer loop.
For example, consider the nested loops:
Variable I is local to the outer foreach loop, and is global to the inner foreach loop. Therefore, iterator J is able to iterate from I to I * I in the inner foreach loop. Iterator J is local to the inner loop, and does not occur in the outer loop.
Since a foreach loop with N iterators is converted into N nested foreach loops, the order of the iterators matters.
List comprehensions are compiled into foreach loops.
This list comprehension is compiled to:
Suppose that a user would like to create a list [Y, Y, Y, Y, Y]. The make_list1 predicate incorrectly attempts to make this list; instead, it outputs a list of 5 different variables since Y is local. In order to make all five variables the same, make_list2 makes variable Y global, by adding the line Y = Y to globalize Y.
An exception is an event that occurs during the execution of a program. An exception requires a special treatment. In Picat, an exception is just a term. A built-in exception is a structure, where the name denotes the type of the exception, and the arguments provide other information about the exception, such as the source, which is the goal or function that raised the exception.
Picat throws many types of exceptions. The following are some of the built-in exceptions:1
The built-in predicate throw(Exception) throws Exception. After an exception is thrown, the system searches for a handler for the exception. If none is found, then the system displays the exception and aborts the execution of the current query. It also prints the backtrace of the stack if it is in debug mode. For example, for the function call open("abc.txt"), the following message will be displayed if there is no file that is named "abc.txt".
All exceptions, including those raised by built-ins and interruptions, can be caught by catchers. A catcher is a call in the form:
which is equivalent to Goal, except when an exception is raised during the execution of Goal that unifies Exception. When such an exception is raised, all of the bindings that have been performed on variables in Goal will be undone, and RecoverGoal will be executed to handle the exception. Note that Exception is unified with a renamed copy of the exception before RecoverGoal is executed. Also note that only exceptions that are raised by a descendant call of Goal can be caught.
The call call_cleanup(Call,Cleanup) is equivalent to call(Call), except that Cleanup is called when Call succeeds determinately (i.e., with no remaining choice point), when Call fails, or when Call raises an exception.
The Picat system is a term-rewriting system. For a predicate call, Picat selects a matching rule and rewrites the call into the body of the rule. For a function call C, Picat rewrites the equation C = X where X is a variable that holds the return value of C. Due to the existence of recursion in programs, the term-rewriting process may never terminate. Consider, for example, the following program:
where the predicate edge defines a relation, and the predicate reach defines the transitive closure of the relation. For a query such as reach(a,X), the program never terminates due to the existence of left-recursion in the second rule. Even if the rule is converted to right-recursion, the query may still not terminate if the graph that is represented by the relation contains cycles.
Another issue with recursion is redundancy. Consider the following problem: Starting in the top left corner of a N × N grid, one can either go rightward or downward. How many routes are there through the grid to the bottom right corner? The following gives a program in Picat for the problem:
The function call route(20,1,1) returns the number of routes through a 20×20 grid. The function call route(N,1,1) takes exponential time in N, because the same function calls are repeatedly spawned during the execution, and are repeatedly resolved each time that they are spawned.
Tabling is a memoization technique that can prevent infinite loops and redundancy. The idea of tabling is to memorize the answers to subgoals and use the answers to resolve their variant descendants. In Picat, in order to have all of the calls and answers of a predicate or function tabled, users just need to add the keyword table before the first rule.
With tabling, all queries to the reach predicate are guaranteed to terminate, and the function call route(N,1,1) takes only N2 time.
For some problems, such as planning problems, it is infeasible to table all answers, because there may be an infinite number of answers. For some other problems, such as those that require the computation of aggregates, it is a waste to table non-contributing answers. Picat allows users to provide table modes to instruct the system about which answers to table. For a tabled predicate, users can give a table mode declaration in the form (M1,M2,…,Mn), where each Mi is one of the following: a plus-sign (+) indicates input, a minus-sign (-) indicates output, max indicates that the corresponding variable should be maximized, and min indicates that the corresponding variable should be minimized. The last mode Mn can be nt, which indicates that the argument is not tabled. Two types of data can be passed to a tabled predicate as an nt argument: (1) global data that are the same to all the calls of the predicate, and (2) data that are functionally dependent on the input arguments. Input arguments are assumed to be ground. Output arguments, including min and max arguments, are assumed to be variables. An argument with the mode min or max is called an objective argument. Only one argument can be an objective to be optimized. As an objective argument can be a compound value, this limit is not essential, and users can still specify multiple objective variables to be optimized. When a table mode declaration is provided, Picat tables only one optimal answer for the same input arguments.
The predicate edge(X,Y,W) specifies a weighted directed graph, where W is the weight of the edge between node X and node Y. The predicate sp(X,Y,Path,W) states that Path is a path from X to Y with the minimum weight W. Note that whenever the predicate sp/4 is called, the first two arguments must always be instantiated. For each pair, the system stores only one path with the minimum weight.
The following program finds a shortest path among those with the minimum weight for each pair of nodes:
For each pair of nodes, the pair of variables (W,Len) is minimized, where W is the weight, and Len is the length of a path. The built-in function compare_terms(T1,T2) is used to compare answers. Note that the order is important. If the term would be (Len,W), then the program would find a shortest path, breaking a tie by selecting one with the minimum weight.
The tabling system is useful for offering dynamic programming solutions for planning problems. The following shows a tabled program for general planning problems:
The predicate action(Action,S,S1) selects an action and performs the action on state S to generate another state, S1.
The program shown in Figure 7.1 solves the Farmer’s problem: The farmer wants to get his goat, wolf, and cabbage to the other side of the river. His boat isn’t very big, and it can only carry him and either his goat, his wolf, or his cabbage. If he leaves the goat alone with the cabbage, then the goat will gobble up the cabbage. If he leaves the wolf alone with the goat, then the wolf will gobble up the goat. When the farmer is present, the goat and cabbage are safe from being gobbled up by their predators.
The Picat tabling system employs the so-called linear tabling mechanism, which computes fixpoints by iteratively evaluating looping subgoals. The system uses a data area, called the table area, to store tabled subgoals and their answers. The tabling area can be initialized with the following built-in predicate:
This predicate clears up the table area. It’s the user’s responsibility to ensure that no data in the table area are referenced by any part of the application.
Linear tabling relies on the following three primitive operations to access and update the table area.
The planner module provides several predicates for solving planning problems. Given an initial state, a final state, and a set of possible actions, a planning problem is to find a plan that transforms the initial state to the final state. In order to use the planner module to solve a planning problem, users have to provide the condition for the final states and the state transition diagram through the following global predicates:
A state is normally a ground term. As all states are tabled during search, it is of paramount importance to find a good representation for states such that terms among states can be shared as much as possible.
In addition to the two required predicates given above, users can optionally provide the following global procedures to assist Picat in searching for plans:
These sequence rules ban robots from moving in an interleaving fashion; a robot must continue to move until it takes the action jump or wait before another robot can start moving. The last rule sequence(_, _) => true is necessary; it permits any action to be taken if the partial plan is empty, or if the most recent action in the partial plan is not move.
Depth-bounded search amounts to exploring the search space, taking into account the current available resource amount. A new state is only explored if the available resource amount is non-negative. When depth-bounded search is used, the function current_resource() can be used to retrieve the current resource amount. If the heuristic estimate of the cost to travel from the current state to the final state is greater than the available resource amount, then the current state fails.
In contrast to depth-bounded search, depth-unbounded search does not take into account the available resource amount. A new state can be explored even if no resource is available for the exploration. The advantage of depth-unbounded search is that failed states are never re-explored.
The program shown in Figure 8.1 solves the Farmer’s problem by using the planner module.
Figure 8.2 gives a program for the 15-puzzle problem. A state is represented as a list of sixteen locations, each of which takes the form (Ri,Ci), where Ri is a row number and Ci is a column number. The first element in the list gives the position of the empty square, and the remaining elements in the list give the positions of the numbered tiles from 1 to 15. The function heuristic(Tiles) returns the Manhattan distance between the current state and the final state.
A module is a bundle of predicate and function definitions that are stored in one file. A module forms a name space. Two definitions can have the same name if they reside in different modules. Because modules avoid name clashes, they are very useful for managing source files of large programs.
In Picat, source files must have the extension name ".pi". A module is a source file that begins with a module name declaration in the form:
module Name. |
where Name must be the same as the main file name. A file that does not begin with a module declaration is assumed to belong to the default global module. The following names are reserved for system modules and should not be used to name user modules: basic, bp, cp, glb, io, math, mip, nn, ordset, os, planner, sat, smt, sys, and util.
In order to use symbols that are defined in another module, users must explicitly import them with an import declaration in the form:
import Name1, …, Namen. |
where each imported Namei is a module name. For each imported module, the compiler first searches for it in the search path that is specified by the environment variable PICATPATH. If no module is found, the compiler gives an error message. Several modules are imported by default, including basic, io, math, and sys.
The import relation is not transitive. Suppose that there are three modules: A, B, and C. If A imports B and B imports C, then A still needs to import C in order to reference C’s symbols.
The built-in command cl("xxx") compiles the file xxx.pi and loads the generated code into the interpreter. The built-in command load("xxx") loads the bytecode file xxx.qi. It compiles the source file xxx.pi only when necessary. The load command also imports the public symbols defined in the module to the interpreter. This allows users to use these symbols on the command line without explicitly importing the symbols. If the file xxx.pi imports modules, those module files will be compiled and loaded when necessary.
The Picat system has a global symbol table for atoms, a global symbol table for structure names, and a global symbol table for modules. For each module, Picat maintains a symbol table for the public predicate and function symbols defined in the module. Private symbols that are defined in a module are compiled away, and are never stored in the symbol table. While predicate and function symbols can be local to a module, atoms and structures are always global.
The Picat module system is static, meaning that the binding of normal (or non-higher-order) calls to their definitions takes place at compile time. For each call, the compiler first searches the default modules for a definition that has the same name as the call. If no definition is found, then the compiler searches for a definition in the enclosing module. If no definition is found, the compiler searches the imported modules in the order that they were imported. If no definition is found in any of these modules, then the compiler will issue an warning1 , assuming the symbol is defined in the global module.
It is possible for two imported modules to contain different definitions that have the same name. When multiple names match a call, the order of the imported items determines which definition is used. Picat allows users to use qualified names to explicitly select a definition. A module-qualified call is a call preceded by a module name and ’.’ without intervening whitespace.
The module qsort.pi defines a function named sort using quick sort, and the module isort defines a function of the same name using insertion sort. In the following session, both modules are used.
As sort is also defined in the basic module, which is preloaded, that function is used for the command L=sort([2,1,3]).
Module names are just atoms. Consequently, it is possible to bind a variable to a module name. Nevertheless, in a module-qualified call M.C, the module name can never be a variable. Recall that the dot notation is also used to access attributes and to call predicates and functions. The notation M.C is treated as a call or an attribute if M is not an atom.
Suppose that users want to define a function named generic_sort(M,L) that sorts list L using the sort function defined in module M. Users cannot just call M.sort(L), since M is a variable. Users can, however, select a function based on the value held in M by using function facts as follows:
Because Picat forbids variable module qualifiers and terms in dot notations, it is impossible to create module-qualified higher-order terms. For a higher-order call, if the compiler knows the name of the higher-order term, as in findall(X,member(X,L)), then it searches for a definition for the name, just like it does for a normal call. However, if the name is unknown, as in apply(F,X,Y), then the compiler generates code to search for a definition. For a higher-order call to call/N or apply/N, the runtime system searches modules in the following order:
As private symbols are compiled away at compile time, higher-order terms can never reference private symbols. Due to the overhead of runtime search, the use of higher-order calls is discouraged.
Picat comes with a library of standard modules, described in separate chapters. The function sys.loaded_modules() returns a list of modules that are currently in the system.
Picat has an io module for reading input from files and writing output to files. The io module is imported by default.
The io module contains functions and predicates that read from a file, write to a file, reposition the read/write pointer within a file, redirect input and output, and create temporary files and pipes.
The io module uses file descriptors to read input from files, and to write output to files. A file descriptor is a structure that encodes file descriptor data, including an index in a file descriptor table that stores information about opened files. The following example reads data from one file, and writes the data into another file.
There are two functions for opening a file. Both of them are used in the previous example.
The io module has at least one function for reading data into each primitive data type. It also has functions for reading characters, tokens, strings, and bytes. Recall that strings are stored as lists of single-character atoms.
The read functions in the io module take a file descriptor as the first parameter. This file descriptor is the same descriptor that the open function returns. The parameter can be omitted if it is the standard input file stdin.
There are cases when the read_char(FD,N), and read_byte(FD,N) functions will read fewer than N values. One case occurs when the end of the file is encountered. Another case occurs when reading from a pipe. If a pipe is empty, then the read functions wait until data is written to the pipe. As soon as the pipe has data, the read functions read the data. If a pipe has fewer than N values when a read occurs, then these three functions will return a string that contains all of the values that are currently in the pipe, without waiting for more values. In order to determine the actual number of elements that were read, after the functions return, use length(List) to check the length of the list that was returned.
The io module also has functions that peek at the next value in the file without changing the current file location. This means that the next read or peek function will return the same value, unless the read/write pointer is repositioned or the file is modified.
The end of a file is detected through the end_of_file atom. If the input function returns a single value, and the read/write pointer is at the end of the file, then the end_of_file atom is returned. If the input function returns a list, then the end-of-file behavior is more complex. If no other values have been read into the list, then the end_of_file atom is returned. However, if other values have already been read into the list, then reaching the end of the file causes the function to return the list, and the end_of_file atom will not be returned until the next input function is called.
Instead of checking for end_of_file, the at_end_of_stream predicate can be used to monitor a file descriptor for the end of a file.
The advantage of using the at_end_of_stream predicate instead of using the end_of_file atom is that at_end_of_stream immediately indicates that the end of the file was reached, even if the last read function read values into a list. In the first example in this chapter, which used the end_of_file atom, an extra read_line function was needed before the end of the file was detected. In the above example, which used at_end_of_stream, read_line was only called if there was data remaining to be read.
The write and print predicates take a file descriptor as the first parameter. The file descriptor is the same descriptor that the open function returns. If the file descriptor is stdout, then the parameter can be omitted.
Note that these predicates write both primitive values and compound values.
The writef predicate includes a parameter that specifies the string that is to be formatted. The Format parameter is a string that contains format characters. Format characters take the form %[flags][width][.precision]specifier. Only the percent sign and the specifier are mandatory. Flags can be used for justification and padding. The width is the minimum number of characters that are to be printed. The precision is the number of characters that are to be printed after the number’s radix point. Note that the width includes all characters, including the radix point and the characters that follow it. The specifier indicates the type of data that is to be written. A specifier can be one of the C format specifiers %%, %c,1 %d, %e, %E, %f, %g, %G, %i, %o, %s, %u, %x, and %X. In addition, Picat uses the specifier %n for newlines, and uses %w for terms. For details, see Appendix F.
This writes, “Hello, Bob. Happy birthday! You are 7 years old today. That is 16.67% older than you were last year.”
The io module also has the three print predicates.
The following example demonstrates the differences between the write and print predicates.
The io module has one predicate to flush a file stream, and one predicate to close a file stream.
The atoms stdin, stdout, and stderr represent the file descriptors for standard input, standard output, and standard error. These atoms allow the program to use the input and output functions of the io module to read from and to write to the three standard streams.
Many applications require event-driven computing. For example, an interactive GUI system needs to react to UI events such as mouse clicks on UI components; a Web service provider needs to respond to service requests; a constraint propagator for a constraint needs to react to updates to the domains of the variables in the constraint. Picat provides action rules for describing event-driven actors. An actor is a predicate call that can be delayed and can be activated later by events. Actors communicate with each other through event channels.
An event channel is an attributed variable to which actors can be attached, and through which events can be posted to actors. A channel has four ports, named ins, bound, dom, and any, respectively. Many built-ins in Picat post events. When an attributed variable is instantiated, an event is posted to the ins-port of the variable. When the lower bound or upper bound of a variable’s domain changes, an event is posted to the bound-port of the variable. When an inner element E, which is neither the lower or upper bound, is excluded from the domain of a variable, E is posted to the dom-port of the variable. When an arbitrary element E, which can be the lower or upper bound or an inner element, is excluded from the domain of a variable, E is posted to the any-port of the variable. The division of a channel into ports facilitates speedy handling of events. For better performance, the system posts an event to a port only when there are actors attached to the port. For example, if no actor is attached to a domain variable to handle exclusions of domain elements, then these events will never be posted.
The built-in post_event(X,T) posts the event term T to the dom-port of the channel variable X.
The following built-ins are used to post events to one of a channel’s four ports:
The call post_event(X,T) is the same as post_event_dom(X,T). This means that the dom-port of a finite domain variable has two uses: posting exclusions of inner elements from the domain, and posting general term events.
Picat provides action rules for describing the behaviors of actors. An action rule takes the following form:
Head, Cond, {Event} => Body |
where Head is an actor pattern, Cond is an optional condition, Event is a non-empty set of event patterns separated by ’,’, and Body is an action. For an actor that is activated by an event, an action rule is said to be applicable if the actor matches Head and Cond is true. A predicate for actors is defined with action rules and non-backtrackable rules. It cannot contain backtrackable rules.
Unlike rules for a normal predicate or function, in which the conditions can contain any predicates, the conditions of the rules in a predicate for actors must be conjunctions of inline test predicates, such as type-checking built-ins (e.g., integer(X) and var(X)) and comparison built-ins (e.g., equality test X == Y , disequality test X !== Y , and arithmetic comparison X > Y ). This restriction ensures that no variables in an actor can be changed while the condition is executed.
For an actor that is activated by an event, the system searches the definition sequentially from the top for an applicable rule. If no applicable rule is found, then the actor fails. If an applicable rule is found, the system executes the body of the rule. If the body fails, then the actor also fails. The body cannot succeed more than once. The system enforces this by converting Body into ‘once Body’ if Body contains calls to nondeterministic predicates. If the applied rule is an action rule, then the actor is suspended after the body is executed, meaning that the actor is waiting to be activated again. If the applied rule is a normal non-backtrackable rule, then the actor vanishes after the body is executed. For each activation, only the first applicable rule is applied.
For a call and an action rule ‘Head,Cond,{Event} => Body’, the call is registered as an actor if the call matches Head and Cond evaluates to true. The event pattern Event implicitly specifies the ports to which the actor is attached, and the events that the actor watches. The following event patterns are allowed in Event:
In an action rule, multiple event patterns can be specified. After a call is registered as an actor on the channels, it will be suspended, waiting for events, unless the atom generated occurs in Event, in which case the actor will be suspended after Body is executed.
The system has an event queue. After events are posted, they are added into the queue. Events are not handled until execution enters or exits a non-inline predicate or function. In other words, only non-inline predicates and functions can be interrupted, and inline predicates, such as X = Y , and inline functions, such as X + Y , are never interrupted by events.
Consider the following action rule:
The following gives a query and its output:
The call p(X) is an actor. After X.post_event(ping), the actor is activated and the body of the action rule is executed, giving the output ping. After X.post_event(pong), the actor is activated again, outputting pong.
There is no primitive for killing actors or explicitly detaching actors from channels. As described above, an actor never disappears as long as action rules are applied to it. An actor vanishes only when a normal rule is applied to it. Consider the following example.
An actor defined here can only handle one event posting. After it handles an event, it binds the variable Flag. When a second event is posted, the action rule is no longer applicable, causing the second rule to be selected.
One question arises here: what happens if a second event is never posted to X? In this case, the actor will stay forever. If users want to immediately kill the actor after it is activated once, then users have to define it as follows:
In this way, the actor will be activated again after Flag is bound to 1, and will be killed after the second rule is applied to it.
The built-in predicate freeze(X,Goal) is equivalent to ‘once Goal’, but its evaluation is delayed until X is bound to a non-variable term. The predicate is defined as follows:
For the call freeze(X,Goal), if X is a variable, then X is registered as an actor on the ins-port of X, and X is then suspended. Whenever X is bound, the event ins is posted to the ins-port of X, which activates the actor freeze(X,Goal). The condition var(X) is checked. If true, the actor is suspended again; otherwise, the second rule is executed, causing the actor to vanish after it is rewritten into once Goal.
The built-in predicate different_terms(T1,T2) is a disequality constraint on terms T1 and T2. The constraint fails if the two terms are identical; it succeeds whenever the two terms are found to be different; it is delayed if no decision can be made because the terms are not sufficiently instantiated. The predicate is defined as follows:
The call different_terms(X,Y,B) is delayed if either X or Y is a variable. The delayed call watches ins(X) and ins(Y) events. Once both X and Y become non-variable, the action rule becomes inapplicable, and one of the subsequent rules will be applied. If X and Y are lists, then they are different if the heads are different (B1), or if the tails are different (B2). This relationship is represented as the Boolean constraint B #= (B1 #' / B2). If X and Y are both structures, then they are different if the functor is different, or if any pair of arguments of the structures is different.
A constraint propagator is an actor that reacts to updates of the domains of the variables in a constraint. The following predicate defines a propagator for maintaining arc consistency on X for the constraint X+Y #= C:
Whenever an inner element Ey is excluded from the domain of Y, this propagator is triggered to exclude C-Ey, which is the support of Ey, from the domain of X. For the constraint X+Y #= C, users need to generate two propagators, namely, x_in_c_y_ac(X,Y,C) and x_in_c_y_ac(Y,X,C), to maintain the arc consistency. Note that in addition to these two propagators, users also need to generate propagators for maintaining interval consistency, because dom(Y,Ey) only captures exclusions of inner elements, and does not capture bounds. The following propagator maintains interval consistency for the constraint:
When both X and Y are variables, the propagator x_add_y_eq_c_ic(X,Y,C) is activated whenever X and Y are instantiated, or whenever the bounds of their domains are updated. The body maintains the interval consistency of the constraint X+Y #= C. The body is also executed when the propagator is generated. When either X or Y becomes non-variable, the propagator becomes a normal call, and vanishes after the variable X or Y is solved.
Picat provides four solver modules, including cp, sat, smt, and mip, for modeling and solving constraint satisfaction and optimization problems (CSPs). All four of these modules implement the same set of constraints on integer-domain variables. The mip module also supports real-domain variables. In order to use a solver, users must first import the module. In order to make the symbols defined in a module available to the top level of the interpreter, users can use the built-in import to import the module. The mip module requires Gurobi (http://www.gurobi.com/), Cbc (https://github.com/coin-or/Cbc), or GLPK (https://www.gnu.org/software/glpk/). The smt module requires Z3 (https://github.com/Z3Prover/z3/) or CVC4 (https://cvc4.github.io/). Users can specify, as a solving option, a solver to be used if the module is mip or smt.
As the four modules have the same interface, this chapter describes the four modules together. Figure 12.1 shows the constraint operators that are provided by Picat. Unless it is explicitly specified otherwise, the built-ins that are described in this chapter appear in all four modules. In the built-ins that are presented in this chapter, an integer-domain variable can also be an integer, unless it is explicitly specified to only be a variable.
Precedence | Operators |
Highest | ::, notin, #=, #!=, #<, #=<, #<=, #>, #>= |
#~ | |
#/' | |
#^ | |
#' / | |
#=> | |
Lowest | #<=> |
A constraint program normally poses a problem in three steps: (1) generate variables; (2) generate constraints over the variables; and (3) call solve to find a valuation for the variables that satisfies the constraints and possibly optimizes an objective function.
This program in Figure 12.1 imports the cp module in order to solve the N-queens problem. The same program runs with the SAT solver if sat is imported, runs with the SMT solver if smt is imported, and runs with the LP/MIP solver if mip is imported. The predicate Qs :: 1..N declares the domains of the variables. The operator #!= is used for inequality constraints. In arithmetic constraints, expressions are treated as terms, and it is unnecessary to enclose them with dollar-signs. The predicate solve(Qs) calls the solver in order to solve the array of variables Qs. For cp, solve([ff],Qs), which always selects a variable that has the smallest domain (the so-called first-fail principle), can be more efficient than solve(Qs).
A domain variable is an attributed variable that has a domain attribute. The Boolean domain is treated as a special integer domain where 1 denotes true and 0 denotes false. Domain variables are declared with the built-in predicate V ars :: Exp.
Domain variables, when being created, are usually represented internally by using intervals. An interval turns to a bit vector when a hole occurs in the interval. The following built-in predicate can be used to reset the range or access the current range.
The following built-ins are provided for domain variables.
A table constraint, or an extensional constraint, over a tuple of variables specifies a set of tuples that are allowed (called positive) or disallowed (called negative) for the variables. A positive constraint takes the form table_in(DV ars,R), where DV ars is either a tuple of variables {X1,…,Xn} or a list of tuples of variables, and R is a list of tuples in which each tuple takes the form {a1 , … , an }, where ai is an integer or the don’t-care symbol *. A negative constraint takes the form table_notin(DV ars,R).
The following example solves a toy crossword puzzle. One variable is used for each cell in the grid, so each slot corresponds to a tuple of variables. Each word is represented as a tuple of integers, and each slot takes on a set of words of the same length as the slot. Recall that the function ord(Char) returns the code of Char, and that the function chr(Code) returns the character of Code.
An arithmetic constraint takes the form
Exp1 Rel Exp2 |
where Exp1 and Exp2 are arithmetic expressions, and Rel is one of the constraint operators: #=, #!=, #<, #=<, #<=, #>, or #>=. The operators #=< and #<= are the same, meaning less than or equal to. An arithmetic expression is made from integers, variables, arithmetic functions, and constraints. The following arithmetic functions are allowed: + (addition), - (subtraction), * (multiplication), / (truncated integer division), // (truncated integer division), count, div (floored integer division), mod, ** (power), abs, min, max, and sum. Except for index notations, array comprehensions and list comprehensions, which are interpreted as function calls as in normal expressions, expressions in arithmetic constraints are treated as terms, and it is unnecessary to enclose them with dollar-signs. In addition to the numeric operators, the following functions are allowed in constraints:
When a constraint occurs in an arithmetic expression, it is evaluated to 1 if it is satisfied and 0 if it is not satisfied.
This program uses MIP to solve the maximum integer flow problem. Given the capacity matrix M of a directed graph, the start vertex Source, and the destination vertex Sink, the predicate maxflow(M,Source,Sink) finds a maximum flow from Source to Sink over the graph. When two vertices are not connected by an arc, the capacity is given as 0. The first foreach loop specifies the domains of the variables. For each variable X[I,J], the domain is restricted to integers between 0 and the capacity, M[I,J]. If the capacity is 0, then the variable is immediately instantiated to 0. The next foreach loop posts the conservation constraints. For each vertex I, if it is neither the source nor the sink, then its total incoming flow amount
is equal to the total outgoing flow amount
The total flow amount is the total outgoing amount from the source, which is the same as the total incoming amount to the sink.
A Boolean constraint takes one of the following forms:
#~ BoolExp BoolExp #/' BoolExp BoolExp #^ BoolExp BoolExp #' / BoolExp BoolExp #=> BoolExp BoolExp #<=> BoolExp |
BoolExp is either a Boolean constant (0 or 1), a Boolean variable (an integer-domain variable with the domain [0,1]), an arithmetic constraint, a domain constraint (in the form of V ar :: Domain or V ar notin Domain), or a Boolean constraint. As shown in Table 12.1, the operator #~ has the highest precedence, and the operator #<=> has the lowest precedence. Note that the Boolean constraint operators have lower precedence than the arithmetic constraint operators. So the constraint
is interpreted as
The Boolean constraint operators are defined as follows.
A global constraint is a constraint over multiple variables. A global constraint can normally be translated into a set of smaller constraints, such as arithmetic and Boolean constraints. If the cp module is used, then global constraints are not translated into smaller constraints; rather, they are compiled into special propagators that maintain a certain level of consistency for the constraints. In Picat, constraint propagators are encoded as action rules. If the sat module is used, then global constraints are translated into smaller constraints before being translated further into conjunctive normal form. If the mip module is used, then global constraints are decomposed into equality and disequality constraints.
Picat provides the following global constraints.
[3,4,2,1] is a solution, but [2,1,4,3] is not, because the graph 1->2, 2->1, 3->4, 4->3 contains two sub-cycles.
Note that there is an edge between each pair of neighboring cells in the resulting graph as long as the cells are in the graph.
The following options are accepted by all four of the solvers.2
The cp module also accepts the following options:
cbc TmpFile solve -solu SolFile |
where SolFile is a file for the solution, and TmpFile is a file that stores the CPLEX-format constraints. Picat throws existence_error if the command cbc is not available. dump: Dump the constraints in CPLEX format to stdout. dump(File): Dump the CPLEX format to File. glpk: Instruct Picat to use the GLPK MIP solver. Picat uses the following command to call the GLPK solver:
glpsol -lp -o SolFile TmpFile |
where SolFile is a solution file, and TmpFile is a file that stores the CPLEX-format constraints. Picat throws existence_error if the command glpsol is not available. gurobi: Instruct Picat to use the Gurobi MIP solver. Picat uses the following command to call the Gurobi solver:
gurobi_cl ResultFile=SolFile TmpFile |
where SolFile is a file for the solution, and TmpFile is a file that stores the CPLEX-format constraints. Picat throws existence_error if the command gurobi_cl is not available. tmp(File): Dump the CPLEX format to File rather than the default file “__tmp.lp”, before calling the mip solver. The name File must be a string or an atom that has the extension name “.lp”. When this file name is specified, the mip solver will save the solution into a file name that has the same main name as File but the extension name “.sol”.
cvc4 TmpFile > SolFile |
where TmpFile is a file that stores the SMT-LIB2-format constraints, and SolFile is a solution file. Picat throws existence_error if the command cvc4 is not available in the path. dump: Dump the constraints in SMT-LIB2 format to stdout. dump(File): Dump the SMT-LIB2 format to File. logic(Logic): Instruct the SMT solver to use Logic in the solving, where Logic must be an atom or a string, and the specified logic must be available in the SMT solver. The default logic for Z3 is “LIA”, and the default logic for CVC4 is “NIA”. tmp(File): Dump the SMT-LIB2 format to File rather than the default file “__tmp.smt2”, before calling the smt solver. The name File must be a string or an atom that has the extension name “.smt2”. When this file name is specified, the smt solver will save the solution into a file name that has the same main name as File but the extension name “.sol”. z3: Instruct Picat to use the z3 SMT solver. When no SMT solver is specified, Picat first searches for the command z3, and when z3 cannot be found it continues to search for the command cvc4.
The nn module provides a high-level interface between Picat and the FANN1 neural networks library, which implements feedforward neural networks.2 A feedforward neural network consists of neurons organized in layers from an input layer to an output layer, possibly with a number of hidden layers. A feedforward network represents a function from input to output. Neurons in a layer (except for the input layer) are connected to neurons in the previous layer. The connections have weights. The neurons in the input layer receive the input. The information is propagated forward through the layers until it reaches the output layer, where the output is returned. The information that a neuron receives is determined by the connected predecessor neurons, the weights of the connections, and an activation function. The connection weights of a neural network are normally adjusted through training on a given set of input-output pairs, called training data. Once a neural network is trained, it can be used to predict the output for a given input.
The following gives an example program which creates a neural network for the xor function, and trains it on a set of data stored in a file:
The function new_nn({2,3,1}) returns a neural network with three layers, where the input layer has 2 neurons, the hidden layer has 3 neurons, and the output layer has 1 neuron. The program does not specify any activation functions used between layers, entailing that the default activation function, which is sym_sigmoid, will be used. The predicate nn_train(NN,"xor.data") trains the neural network with the training data stored in the file "xor.data". The user is able to specify an algorithm to be used in the training and several parameters that affect the behavior of the algorithm, such as the maximum number of iterations (called epochs), the learning rate, and the error function. This example does not specify a training algorithm or any of the training parameters, entailing that the default algorithm, which is rprop, is used with the default setting. The predicate nn_save saves the trained neural network into a file named "xor.net". The predicate nn_destroy_all clears the neural network and the internal data structures used during training.
The text file "xor.data" contains the following training data:
The three integers in the first line state, respectively, that the number of input-output pairs is 4, the number of input values is 2, and the number of output values is 1. The remaining lines give the input-output pairs.
The following program performs the same task as the above program, except that it trains the neural network with internal data:
The predicate nn_train is overloaded. When the second argument is a file name, Picat reads training data from the file. Otherwise, Picat expects a collection (a list or an array) of input-output pairs.
The following example program illustrates how to use a trained network:
The function nn_load loads a neural network. The function nn_run uses the network to predict the output for an input.
An activation function for a neuron determines how information is propagated to it from its predecessor neurons. When a new neural network is created, it uses the default activation function sym_sigmoid for all of its non-input neurons. The following predicates can be utilized to set activation functions.
The detault activation function is sym_sigmoid. Each of these functions has a corresponding name in FANN. Please refer to the FANN documentation for a more detailed description of these functions.
A training dataset can be supplied to FANN either through a text file or a Picat collection. A training dataset file must have the following format:
A training dataset stored in a Picat collection must be either a list or an array of input-output pairs. An input-output pair has the form (Input,Output), where Input is an array of numbers or a single number, and so is Output.
Picat has an os module for manipulating files and directories. In order to use any of the functions or predicates, users must import the module.
Many of the functions and predicates in this module have a Path parameter. This parameter is a string or an atom, representing the path of a file or directory. This path can be an absolute path, from the system’s root directory, or a relative path, from the current file location. Different systems use different separator characters to separate directories in different levels of the directory hierarchy. For example, Windows uses ‘\’ and Unix uses ‘/’. The following function outputs a single character, representing the character that the current system uses as a file separator.
The os module includes functions for reading and modifying directories. The following example shows how to list all of the files in a directory tree, using a depth-first directory traversal.
The following function can be used to read the contents of a directory:
The above example also uses the directory predicate, which will be discussed in Section 14.4.
The os module includes two functions that obtain the program’s current working directory:
The os module also includes two predicates to change the program’s current working directory:
If the cd and chdir predicates cannot move to the directory specified by Path, the functions throw an error. This can occur if Path does not exist, if Path is not a directory, or if the program does not have permission to access Path.
The os module contains a number of predicates for creating new files and directories:
The os module contains a number of predicates for deleting files and directories.
The os module contains a number of functions that retrieve file status information, and predicates that test the type of a file. These predicates will all throw an error if the program does not have permission to read from Path.
The following example shows how to use a few of the predicates.
Picat provides a math module, which has common mathematical constants and functions. The math module is imported by default.
The math module provides two constants.
The math module contains mathematical functions that serve a number of different purposes. Note that the arguments must all be numbers. If the arguments are not numbers, then Picat will throw an error.
The following functions deal with the positivity and negativity of numbers.
The math module includes the following functions for converting a real number into the integers that are closest to the number.
The following functions provide exponentiation, root, and logarithmic functions. Note that, in the logarithmic functions, if X ≤ 0, then an error is thrown.
The math module has two functions to convert between degrees and radians.
The math module provides the following trigonometric functions.
The math module provides the following hyperbolic functions.
The following functions provide access to a random number generator.
The sys module, which is imported by default, contains built-ins that are relevant to the Picat system. The built-ins in the sys module perform operations that include compiling programs, tracing execution, and displaying statistics and information about the Picat system.
The sys module includes a number of built-ins for compiling programs and loading them into memory.
The Picat system has three execution modes: non-trace mode, trace mode, and spy mode. In trace mode, it is possible to trace the execution of a program, showing every call in every possible stage. In order to trace the execution, the program must be recompiled while the system is in trace mode. In spy mode, it is possible to trace the execution of individual functions and predicates. The following predicates are used to switch between non-trace mode and trace mode.
In trace mode, the system displays a message when a function or a predicate is entered (Call), exited (Exit), re-entered (Redo) or has failed (Fail). After a function or a predicate is entered or re-entered, the system waits for a command from the user. A command is a single letter followed by a carriage-return, or may simply be a carriage-return. The following commands are available:
The sys module contains a number of built-ins that display information about the Picat system. This information includes statistics about the system, including the memory that is used, and the amount of time that it takes to perform a goal.
The following built-ins display statistics about the memory that Picat system uses.
The following predicates display the amount of CPU time that it takes to perform a goal.
Picat incorporates an incremental garbage collector for the control stack and the heap. The garbage collector is active by default. The sys module includes the following predicates for garbage collection.
The following predicates can be used to terminate the Picat interpreter.
The util module provides general useful utility functions and predicates. This module is expected to be expanded in the future. This module must be imported before use.
An array matrix is a two-dimensional array. The first dimension gives the number of the rows and the second dimension gives the number of the columns. A list matrix represents a matrix as a list of lists.
An ordered set is represented as a sorted list that does not contain duplicates. The ordset module provides useful utility functions and predicates on ordered sets. This module must be imported before use.
Picat’s datetime module provides built-ins for retrieving the date and time. This module must be imported before use.
date_time(Y ear,Month,Day,Hour,Minute,Second) |
where the arguments are all integers, and have the following meanings and ranges.
Argument | Meaning | Range |
Y ear | years since 1900 | an integer |
Month | months since January | 0-11 |
Day | day of the month | 1-31 |
Hour | hours since midnight | 0-23 |
Minute | minutes after the hour | 0-59 |
Second | seconds after the minute | 0-60 |
In the Month argument, 0 represents January, and 11 represents December. In the Hour argument, 0 represents 12 AM, and 23 represents 11 PM. In the Second argument, the value 60 represents a leap second. current_date() = Date: This function returns the current date as a structure in the form date(Y ear,Month,Day), where the arguments have the meanings and ranges that are defined above. current_day() = WDay: This function returns the number of days since Sunday, in the range 0 to 6. current_time() = Time: This function returns the current time as a structure in the form time(Hour,Minute,Second), where the arguments have the meanings and ranges that are defined above.
The following table shows the specifiers that can be used in formats for the writef, printf, and to_fstring.
Specifier | Output |
%% | Percent Sign |
%c | Character |
%d | Signed Decimal Integer |
%e | Scientific Notation, with Lowercase e |
%E | Scientific Notation, with Uppercase E |
%f | Decimal Real Number |
%g | Shorter of %e and %f |
%G | Shorter of %E and %f |
%i | Signed Decimal Integer |
%n | Platform-independent Newline |
%o | Unsigned Octal Integer |
%s | String |
%u | Unsigned Decimal Integer |
%w | Term |
%x | Unsigned Lowercase Hexadecimal Integer |
%X | Unsigned Uppercase Hexadecimal Integer |
Picat has an interface with C, through which Picat programs can call deterministic predicates that are written as functions in C. C programs that use this interface must include the file "picat.h" in the directory Picat/Emulator. In order to make C-defined predicates available to Picat, users have to re-compile Picat’s C source code together with the newly-added C functions.
Picat’s C interface provides functions for accessing, manipulating, and building Picat terms. In order to understand these functions, users need to know how terms are represented in Picat’s virtual machine.
A term is represented by a word that contains a value and a tag. A word has 32 bits or 64 bits, depending on the underlying CPU and OS. The tag in a word distinguishes the type of the term.
The value of a term is an address, except when the term is an integer (in which case, the value represents the integer itself). The location to which the address points is dependent on the type of the term. In a reference, the address points to the referenced term. An unbound variable is represented by a self-referencing pointer. In an atom, the address points to the record for the atom symbol in the symbol table. In a structure, f(t1,…,tn), the address points to a block of n + 1 consecutive words, where the first word points to the record for the functor, f/n, in the symbol table, and the remaining n words store the components of the structure. Arrays, floating-point numbers, and big integers are represented as special structures. Picat lists are singly-linked lists. In a list, [H|T], the address points to a block of two consecutive words, where the first word stores the car, H, and the second word stores the cdr, T.
A C function that defines a Picat predicate should not take any argument. The following function is used in order to fetch arguments in the current Picat call.
The following functions are provided for testing Picat terms. They return PICAT_TRUE when they succeed and PICAT_FALSE when they fail.
The following functions convert Picat terms to C. If a Picat term does not have the expected type, then the global C variable exception, which is of type Term, is assigned a term. A C program that uses these functions must check exception in order to see whether data are converted correctly. The converted data are only correct when exception is (TERM)NULL.
The following function registers a predicate that is defined by a C function.
The first argument is the predicate name, the second argument is the arity, and the third argument is the name of the function that defines the predicate. The function that defines the predicate cannot take any argument. As described above, picat_get_call_arg(i,arity) is used to fetch arguments from the Picat call.
For example, the following registers a predicate whose name is "p", and whose arity is 2.
The C function’s name does not need to be the same as the predicate name.
Predicates that are defined in C should be registered after the Picat engine is initialized, and before any call is executed. One good place for registering predicates is the Cboot() function in the file cpreds.c, which registers all of the C-defined built-ins of Picat. After registration, the predicate can be called. All C-defined predicates must be explicitly called with the module qualifier bp, as in bp.p(a,X).
Consider the Picat predicate:
where the first argument is given and the second is unknown. The following steps show how to define this predicate in C, and how to make it callable from Picat.
Precedence | Operators |
Highest | ., @ |
** (right-associative) | |
unary +, unary -, ~ | |
*, /, //, /<, />, div, mod, rem | |
binary +, binary - | |
>>, << | |
/' | |
^ | |
' / | |
.. | |
++ (right-associative) | |
=, !=, :=, ==, !==, =:=, <, =<, <=, >, >=, ::, in, notin, =.. | |
#=, #!=, #<, #=<, #<=, #>, #>=, @<, @=<, @<=, @>, @>= | |
#~ | |
#/' | |
#^ | |
#' / | |
#=> (right-associative) | |
#<=> | |
not, once, ' + | |
, (right-associative), && (right-associative) | |
Lowest | ; (right-associative), || (right-associative) |