Symbols, Concepts, and Facts

Programming vs AI Programming

Programming          AI Programming
numbers              units, measures
strings              symbols, concept hierarchies
arrays, tuples       lists, facts, trees, graphs
key-value objects    frames, triples
dates                events, temporal relations
=, <, >              similarity, pattern matching
loops                recursion, search (breadth-first, depth-first, best-first, ...)
imperative           functional, declarative

The Big Difference

  • All knowledge about a domain is in data, not code.
    • This is true in Machine Learning programming too.
  • Code only knows about the structure of the data:
    • How properties are stored
    • How "is a" relationships are stored
  • Code has no references to any domain concept, e.g., no "bird" or "animal" variables or classes.

Symbols

All programming languages have symbols.

int sum = x + y;

int, sum, x, and y are symbols. So are = and +.

Symbols are never in string quotes “...”.

Code won’t work unless the symbol X is the same X every time you use it.

Symbols are often interned into namespaces.
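
A minimal sketch in Common Lisp of what interning buys you. The helper name same-concept-p is made up for illustration; eq, intern, equal, and copy-seq are standard functions.

```lisp
;; Every time the reader sees BIRD in the same package, it returns the
;; one interned symbol object, so identity comparison with EQ is enough.
(defun same-concept-p (a b)
  "True when A and B are the identical symbol (hypothetical helper)."
  (eq a b))

(same-concept-p 'bird 'bird)        ; true: one symbol, used twice
(same-concept-p 'bird 'animal)      ; false: two different symbols

;; INTERN looks a name up in the current package and creates the symbol
;; only if it is not already there. The default reader upcases names.
(eq (intern "BIRD") 'bird)          ; true as long as *package* is unchanged

;; Strings carry no such guarantee: equal characters, distinct objects.
(equal (copy-seq "bird") "bird")    ; true: same characters
(eq (copy-seq "bird") "bird")       ; false: COPY-SEQ makes a fresh string
```

This is why symbols, not strings, are the natural choice for concept names: comparing two concepts is a single pointer comparison.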

Symbols in Code

The compiler for every language uses symbols. They're really useful.

Compilers keep symbols to themselves, though: few languages besides Lisp let you easily use symbols in your own code.

  • Scheme, Ruby, ... ?
  • JavaScript has symbols but they are not meant for normal code.

Symbols in AI

Facts

Tweety is a canary. A canary is a bird. Chilly is a penguin. A penguin is a bird. A bird is an animal.

How do we represent these so a computer can understand?

Semantic triples

subject relation object.
tweety is-a canary. canary is-a bird. chilly is-a penguin. penguin is-a bird. bird is-a animal.

Triples are the “binary representation” of knowledge networks. There’s a W3C standard for them: RDF, the Resource Description Framework.

Use symbols for subjects, relations, and objects.

Triples in Lisp

((tweety isa canary)
 (canary isa bird)
 (chilly isa penguin)
 (penguin isa bird)
 (bird isa animal))
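
A sketch of how code might query a triple list like this one. The names *facts*, match-p, and find-triples, and the use of ? as a wildcard, are assumptions for illustration. Note that the code never mentions a bird or a canary; it only knows that a fact is a three-element list.

```lisp
(defparameter *facts*
  '((tweety isa canary)
    (canary isa bird)
    (chilly isa penguin)
    (penguin isa bird)
    (bird isa animal)))

(defun match-p (pattern-item fact-item)
  "A pattern item matches if it is the wildcard ? or the same symbol."
  (or (eq pattern-item '?)
      (eq pattern-item fact-item)))

(defun find-triples (pattern &optional (facts *facts*))
  "Return every triple in FACTS that matches PATTERN position by position."
  (remove-if-not (lambda (fact) (every #'match-p pattern fact))
                 facts))

(find-triples '(? isa bird))    ; => ((CANARY ISA BIRD) (PENGUIN ISA BIRD))
(find-triples '(tweety ? ?))    ; => ((TWEETY ISA CANARY))
```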

Frames

Organize facts by subject.

Triples:

((tweety isa canary)
 (canary isa bird)
 (canary color yellow)
 (bird isa animal)
 (bird airborne yes))

Frames:

((bird (animal) (airborne yes))
 (canary (bird) (color yellow))
 (tweety (canary)))
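
One way to get from triples to frames is to group by subject. This is a sketch, not necessarily how the course code does it; triples->frames is a made-up name, and it assumes the frame layout (subject (abstraction ...) (property value) ...) with isa as the only relation treated specially.

```lisp
(defun triples->frames (triples)
  "Group TRIPLES by subject into frames of the form
(subject (abstraction ...) (property value) ...)."
  (let ((frames '()))
    (dolist (triple triples (nreverse frames))
      (destructuring-bind (subject relation object) triple
        ;; Find the frame for this subject, or start an empty one.
        (let ((frame (or (assoc subject frames)
                         (first (push (list subject '()) frames)))))
          (if (eq relation 'isa)
              ;; isa objects go in the abstraction list (second element)
              (setf (second frame)
                    (append (second frame) (list object)))
              ;; any other relation becomes a (property value) slot
              (setf (cddr frame)
                    (append (cddr frame)
                            (list (list relation object))))))))))

(triples->frames '((tweety isa canary)
                   (canary isa bird)
                   (canary color yellow)
                   (bird isa animal)
                   (bird airborne yes)))
;; => ((TWEETY (CANARY))
;;     (CANARY (BIRD) (COLOR YELLOW))
;;     (BIRD (ANIMAL) (AIRBORNE YES)))
```

Frames come out in order of first appearance of each subject; the order of frames carries no meaning.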

Knowledge graphs

Triples with common subjects and objects form a knowledge graph.

The subjects and objects are the vertices or nodes of the graph.

The relations are the labeled edges of the graph.
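
A small sketch of reading a triple list as a graph. The function names graph-nodes and outgoing-edges are assumptions for illustration.

```lisp
(defun graph-nodes (triples)
  "All distinct subjects and objects, in order of first appearance."
  (let ((nodes '()))
    (dolist (triple triples (nreverse nodes))
      (pushnew (first triple) nodes)
      (pushnew (third triple) nodes))))

(defun outgoing-edges (node triples)
  "Labeled edges leaving NODE, as (relation target) pairs."
  (loop for (subject relation object) in triples
        when (eq subject node)
          collect (list relation object)))

(defparameter *graph-facts*
  '((tweety isa canary)
    (canary isa bird)
    (canary color yellow)
    (bird isa animal)))

(graph-nodes *graph-facts*)
;; => (TWEETY CANARY BIRD YELLOW ANIMAL)
(outgoing-edges 'canary *graph-facts*)
;; => ((ISA BIRD) (COLOR YELLOW))
```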

Knowledge Graphs vs LLMs

  • Large Language Models have vector embeddings.
  • A vector embedding of a word is a large tuple of numbers. The goal in training is to give words that appear in similar contexts tuples that are very close by some distance metric.
  • E.g., "king" and "queen" are very near in vector space.
  • Does this capture ontological knowledge? Sort of.
  • Can knowledge graphs be combined with LLMs? Yes.
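
To make "very close" concrete, here is a sketch using cosine similarity, one common closeness measure (higher means closer). The three-number vectors are invented toy values, not real embeddings, and all function names are assumptions.

```lisp
;; Toy 3-number "embeddings". These values are invented for illustration;
;; real embeddings have hundreds or thousands of learned dimensions.
(defparameter *toy-embeddings*
  '((king   0.9  0.8  0.1)
    (queen  0.85 0.75 0.2)
    (canary 0.1  0.2  0.9)))

(defun embedding (word)
  "The tuple of numbers stored for WORD."
  (rest (assoc word *toy-embeddings*)))

(defun dot (u v)
  "Sum of the pairwise products of U and V."
  (reduce #'+ (mapcar #'* u v)))

(defun cosine-similarity (u v)
  "Near 1 means same direction; near 0 means unrelated directions."
  (/ (dot u v)
     (* (sqrt (dot u u)) (sqrt (dot v v)))))

(defun closer-p (word a b)
  "True when WORD is more similar to A than to B."
  (> (cosine-similarity (embedding word) (embedding a))
     (cosine-similarity (embedding word) (embedding b))))

(closer-p 'king 'queen 'canary)   ; true with these toy values
```

Note what is missing compared with a knowledge graph: the numbers say king and queen are close, but not how they are related.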

IS-A trees

tweety is-a canary. canary is-a bird. chilly is-a penguin. penguin is-a bird. bird is-a animal.

Inheritance

tweety is-a canary. canary is-a bird. chilly is-a penguin. penguin is-a bird. bird is-a animal. bird airborne yes. penguin airborne no.

Inference

Is Tweety an animal? Can Tweety fly? Can Chilly fly?
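
A sketch of how these three questions could be answered from the facts on the Inheritance slide, assuming single inheritance (each concept has at most one isa fact). The names *bird-facts*, parent, isa-p, and inherited-value are made up for illustration.

```lisp
(defparameter *bird-facts*
  '((tweety isa canary) (canary isa bird)
    (chilly isa penguin) (penguin isa bird)
    (bird isa animal)
    (bird airborne yes) (penguin airborne no)))

(defun lookup (subject relation facts)
  "The object of the first fact with SUBJECT and RELATION, or NIL."
  (third (find-if (lambda (fact)
                    (and (eq (first fact) subject)
                         (eq (second fact) relation)))
                  facts)))

(defun parent (concept facts)
  "The immediate abstraction of CONCEPT, or NIL at the top."
  (lookup concept 'isa facts))

(defun isa-p (concept abstraction facts)
  "True if ABSTRACTION is CONCEPT itself or anywhere above it."
  (cond ((null concept) nil)
        ((eq concept abstraction) t)
        (t (isa-p (parent concept facts) abstraction facts))))

(defun inherited-value (concept property facts)
  "Walk up the isa chain; return the first value found for PROPERTY."
  (loop for c = concept then (parent c facts)
        while c
        do (let ((value (lookup c property facts)))
             (when value (return value)))))

(isa-p 'tweety 'animal *bird-facts*)              ; => T
(inherited-value 'tweety 'airborne *bird-facts*)  ; => YES
(inherited-value 'chilly 'airborne *bird-facts*)  ; => NO
```

Chilly's answer comes from penguin, which is found before bird on the way up: the more specific fact overrides the more general one.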

Multiple inheritance

Is Tweety an animal? Can Tweety fly?

Nixon diamond

Nixon is-a Republican. Nixon is-a Quaker. Republican is-a person. Quaker is-a person. Republican pro-war yes. Quaker pro-war no.
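
A sketch of why this is a problem. With two parents, "walk up until you find a value" is no longer well defined. The hypothetical function nearest-values below follows every isa path and returns all the nearest answers it finds.

```lisp
(defparameter *nixon-facts*
  '((nixon isa republican) (nixon isa quaker)
    (republican isa person) (quaker isa person)
    (republican pro-war yes) (quaker pro-war no)))

(defun objects-of (subject relation facts)
  "Every object O such that (SUBJECT RELATION O) is in FACTS."
  (loop for (s r o) in facts
        when (and (eq s subject) (eq r relation))
          collect o))

(defun nearest-values (concept property facts)
  "Values for PROPERTY stored on CONCEPT itself, or failing that,
the nearest values found along each isa path above it."
  (or (objects-of concept property facts)
      (remove-duplicates
       (loop for p in (objects-of concept 'isa facts)
             append (nearest-values p property facts))
       :from-end t)))

(nearest-values 'nixon 'pro-war *nixon-facts*)      ; => (YES NO)
(nearest-values 'republican 'pro-war *nixon-facts*) ; => (YES)
```

Two contradictory answers come back for Nixon, and nothing in the facts says which parent should win.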

Linearization

A well-defined order for searching the abstractions of a concept for a property. Stop at the first answer found.

A linearization contains the concept and all concepts above it in an ontology, such that no concept appears to the left of a more specific concept.

OK: tweety canary bird pet animal
OK: tweety pet canary bird animal
NOT OK: tweety canary bird animal pet
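
The ordering rule can be checked mechanically. The sketch below assumes the ontology behind these examples is: tweety isa canary, tweety isa pet, canary isa bird, bird isa animal, pet isa animal. That ontology and all function names are assumptions. The check covers only the ordering constraint, not whether every abstraction is present.

```lisp
(defparameter *pet-facts*
  '((tweety isa canary) (tweety isa pet)
    (canary isa bird) (bird isa animal)
    (pet isa animal)))

(defun parents (concept facts)
  "The immediate abstractions of CONCEPT."
  (loop for (s r o) in facts
        when (and (eq s concept) (eq r 'isa))
          collect o))

(defun above-p (general specific facts)
  "True if GENERAL is strictly above SPECIFIC in the isa hierarchy."
  (some (lambda (p)
          (or (eq p general) (above-p general p facts)))
        (parents specific facts)))

(defun valid-order-p (order facts)
  "True if no concept in ORDER appears to the left of a more specific one."
  (loop for (x . later) on order
        never (some (lambda (y) (above-p x y facts)) later)))

(valid-order-p '(tweety canary bird pet animal) *pet-facts*)  ; => T
(valid-order-p '(tweety pet canary bird animal) *pet-facts*)  ; => T
(valid-order-p '(tweety canary bird animal pet) *pet-facts*)  ; => NIL
```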

Simple algorithm

  • Append the recursively generated linearizations of a concept's immediate abstractions
  • Remove all but the last occurrence of each repeated element
  • Return concept + the de-duplicated list
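
The three steps above translate almost directly into Common Lisp. This is a sketch over a triple list; the ontology in *pet-facts* is an assumption, and parents and linearize are illustrative names. It relies on the standard behavior of remove-duplicates, which by default keeps the last occurrence of each repeated element.

```lisp
(defparameter *pet-facts*
  '((tweety isa canary) (tweety isa pet)
    (canary isa bird) (bird isa animal)
    (pet isa animal)))

(defun parents (concept facts)
  "The immediate abstractions of CONCEPT, in the order listed in FACTS."
  (loop for (s r o) in facts
        when (and (eq s concept) (eq r 'isa))
          collect o))

(defun linearize (concept facts)
  "CONCEPT followed by the de-duplicated linearizations of its parents."
  (cons concept
        ;; REMOVE-DUPLICATES without :FROM-END keeps the LAST occurrence
        (remove-duplicates
         ;; append the recursively generated linearizations
         (loop for p in (parents concept facts)
               append (linearize p facts)))))

(linearize 'tweety *pet-facts*)
;; => (TWEETY CANARY BIRD PET ANIMAL)
```

Before de-duplication the appended list is canary bird animal pet animal. Dropping the first animal is what keeps animal from appearing to the left of pet.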

The Pedalo Paradox

> (get-prop 'pedal-wheel-boat 'nav-zone)
(DAY-BOAT NAV-ZONE 5)

> (get-prop 'small-catamaran 'nav-zone)
(DAY-BOAT NAV-ZONE 5)

> (get-prop 'pedalo 'nav-zone)
(WHEEL-BOAT NAV-ZONE 100)
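
The slide shows only the output, not the boat hierarchy or get-prop. The sketch below is a reconstruction under assumptions: every fact in *boat-facts* is assumed (chosen to be consistent with the three results above), and get-prop is assumed to search a linearization built by the simple append-and-keep-last algorithm and return the first matching triple.

```lisp
;; ASSUMED hierarchy; the original slide does not list these facts.
(defparameter *boat-facts*
  '((day-boat isa boat) (wheel-boat isa boat)
    (engine-less isa day-boat) (small-multihull isa day-boat)
    (pedal-wheel-boat isa engine-less) (pedal-wheel-boat isa wheel-boat)
    (small-catamaran isa small-multihull)
    (pedalo isa pedal-wheel-boat) (pedalo isa small-catamaran)
    (day-boat nav-zone 5) (wheel-boat nav-zone 100)))

(defun parents (concept facts)
  "The immediate abstractions of CONCEPT, in the order listed in FACTS."
  (loop for (s r o) in facts
        when (and (eq s concept) (eq r 'isa))
          collect o))

(defun linearize (concept facts)
  "Simple algorithm: append parent linearizations, keep last duplicates."
  (cons concept
        (remove-duplicates
         (loop for p in (parents concept facts)
               append (linearize p facts)))))

(defun get-prop (concept property &optional (facts *boat-facts*))
  "The first triple for PROPERTY found along CONCEPT's linearization."
  (loop for c in (linearize concept facts)
        thereis (find-if (lambda (fact)
                           (and (eq (first fact) c)
                                (eq (second fact) property)))
                         facts)))

(linearize 'pedalo *boat-facts*)
;; => (PEDALO PEDAL-WHEEL-BOAT ENGINE-LESS WHEEL-BOAT
;;     SMALL-CATAMARAN SMALL-MULTIHULL DAY-BOAT BOAT)
(get-prop 'pedalo 'nav-zone)
;; => (WHEEL-BOAT NAV-ZONE 100)
```

Under these assumptions the paradox is visible in the linearization: both parents of pedalo answer 5 via day-boat, but de-duplication pushes day-boat behind wheel-boat, so pedalo answers 100.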

Evolution of Linearization