These exercises develop some useful utility functions for working with different aspects of the Semantic Web.

Exercises

SW-1: Name conversion

Function names: camelize, hyphenate

Test cases: microdata-exs-tests.lisp.

Semantic web bundle

A common problem in many languages is converting names between camelCase and hyphenated forms. For example, in JavaScript, names are camelCase, but CSS is hyphenated, so the CSS property font-size has to be written as fontSize in JavaScript.

For us, this occurs when reading Semantic Web names, which use camelCase (e.g., JobPosting), when what we want is a Lisp symbol like job-posting.

Define (camelize string [capitalize]) to take a hyphenated string and return the camelCase equivalent. If the optional second argument is given and is true, then the first letter should be capitalized.
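One way to approach camelize, offered as a minimal sketch rather than a reference solution: walk the string once, drop each hyphen, and upcase the character that follows it. Lowercasing every other character is an assumption made here so that input such as "JOB-POSTING" also works; the exercise text does not specify it.

```lisp
(defun camelize (string &optional capitalize)
  "Sketch: convert hyphenated STRING to camelCase.
If CAPITALIZE is true, the first letter is upcased as well."
  (with-output-to-string (out)
    (loop with upcase-next = capitalize
          for ch across string
          do (cond ((char= ch #\-)
                    ;; Drop the hyphen and remember to upcase the next character.
                    (setf upcase-next t))
                   (upcase-next
                    (write-char (char-upcase ch) out)
                    (setf upcase-next nil))
                   (t
                    ;; Assumption: all other characters are downcased.
                    (write-char (char-downcase ch) out))))))
```

With this sketch, (camelize "job-posting") returns "jobPosting" and (camelize "job-posting" t) returns "JobPosting". Check it against the provided test cases before relying on it.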

Define (hyphenate string [case]) to take a camelCase string and return the hyphenated equivalent. If the optional second argument is :upper (the default), then the result string should be all upper case. If it is :lower, it should be all lower case. Any other value is an error.

Only insert a hyphen where the case changes from lower to upper. For example, "getURL" should become "GET-URL", not "GET-U-R-L".
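The hyphenation rule can be sketched as follows. This version reads "when the case changes" as "a lower-case letter followed by an upper-case letter", which is an assumption, and names the optional parameter letter-case (a hypothetical name) to avoid shadowing the CASE macro. It is one possible approach, not the expected solution.

```lisp
(defun hyphenate (string &optional (letter-case :upper))
  "Sketch: insert a hyphen at each lower-to-upper transition in STRING,
then fold the result to upper or lower case."
  (let ((hyphenated
          (with-output-to-string (out)
            (loop for prev = nil then ch
                  for ch across string
                  do (when (and prev
                                (lower-case-p prev)
                                (upper-case-p ch))
                       (write-char #\- out))
                     (write-char ch out)))))
    ;; ECASE signals an error for anything other than :upper or :lower.
    (ecase letter-case
      (:upper (string-upcase hyphenated))
      (:lower (string-downcase hyphenated)))))
```

Here (hyphenate "getURL") returns "GET-URL" and (hyphenate "JobPosting" :lower) returns "job-posting". Because only lower-to-upper transitions count, a run of capitals such as "URL" stays together.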

SW-2: Microdata Reader

This exercise has been retired in favor of JSON-LD.

SW-3: JSON-LD reader

Function name: (read-json-ld url-string)

Submit this in the Code Critic under SWP 1: Semantic Web Personal 1

Your somewhat open-ended job in this exercise is to write a function that can take a URL, get the JSON-LD stored on that page, if any, and return a list of the entities defined in the JSON-LD as a nested list structure. See the testing subsection below for some URLs to test on.

What to read

The point of JSON-LD is that semantic data can be embedded in a web page in an easy-to-retrieve location, using the standardized schema.org ontology, in a relatively easy-to-parse JSON structure. Everything uses standard web technologies, supported by many programming languages.
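On most pages, the "easy to retrieve location" is a script element whose type attribute is application/ld+json. As a rough sketch of locating that data, the code below assumes the page has already been parsed into the nested-list form produced by an HTML parser such as cl-html-parse, where an element is either (tag child ...) or ((tag attr-name attr-value ...) child ...). The function names and the sample data are hypothetical.

```lisp
(defun ld-json-script-p (node)
  "True if NODE looks like ((:script ... :type \"application/ld+json\" ...) ...)."
  (and (consp node)
       (consp (first node))
       (eq (first (first node)) :script)
       ;; Scan the attribute list pairwise; tolerant of odd-length lists.
       (loop for (name value) on (rest (first node)) by #'cddr
             thereis (and (eq name :type)
                          (equalp value "application/ld+json")))))

(defun collect-ld-json-strings (lhtml)
  "Return a list of the text bodies of all JSON-LD script elements in LHTML."
  (cond ((ld-json-script-p lhtml)
         ;; Join the string children of the script element.
         (list (apply #'concatenate 'string
                      (remove-if-not #'stringp (rest lhtml)))))
        ((consp lhtml)
         (loop for child in lhtml
               append (collect-ld-json-strings child)))
        (t '())))

;; Hand-written sample in the assumed parsed form; not taken from a real page.
(defparameter *sample-lhtml*
  '((:html
     (:head
      (:title "Example job")
      ((:script :type "application/ld+json")
       "{\"@type\": \"JobPosting\", \"title\": \"Engineer\"}"))
     (:body (:p "Hello")))))
```

Calling (collect-ld-json-strings *sample-lhtml*) returns a one-element list holding the JSON text, which can then be handed to a JSON decoder. A page may contain several such script elements, so the function returns a list.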

Libraries

For this exercise, most of the work is done by three Common Lisp libraries that can be loaded with Quicklisp: dexador (an HTTP client), cl-json (a JSON encoder and decoder), and cl-html-parse (an HTML parser).

Lisp implementation-specific notes

LispWorks: You need to tell LispWorks to allow international characters in strings, since they occur often on web pages. You do that by executing:

#+lispworks (lw:set-default-character-element-type 'cl:character)

This needs to be done every time you start Lisp. If you want, you can put it in your code file or your init file. The #+lispworks says "evaluate the next expression only in LispWorks".

To install these three libraries:

(ql:quickload "dexador")
(ql:quickload "cl-json")
#-allegro (ql:quickload "cl-html-parse")

The #-allegro says "skip the next expression in Allegro". The HTML library is already included in Allegro, so the portable version should not be installed there.

With these libraries installed,

Test all of these functions by hand before going any further to make sure they work properly. If they do not, post to Piazza. Include which Lisp and operating system you are using, and the exact input and output.
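As a hedged sketch of what testing by hand might look like, the forms below exercise one entry point from each library. The choice of dex:get, net.html.parser:parse-html, and cl-json:decode-json-from-string is an assumption about which functions are meant, the helper fetch-page is hypothetical, and the (require :phtml) line for Allegro is likewise an assumption.

```lisp
;; Load the libraries (assumes Quicklisp is already set up).
(ql:quickload "dexador")
(ql:quickload "cl-json")
#-allegro (ql:quickload "cl-html-parse")
#+allegro (require :phtml)   ; assumption: Allegro's built-in parser module

;; 1. HTTP: DEX:GET returns the response body as its first value.
;;    Wrapped in a function so nothing touches the network at load time.
(defun fetch-page (url)
  "Return the body of URL (a string for ordinary HTML pages)."
  (values (dex:get url)))

;; 2. HTML: parse a small literal string into nested lists.
(net.html.parser:parse-html "<p>Hello <b>world</b></p>")

;; 3. JSON: decode a small literal object.
(cl-json:decode-json-from-string
 "{\"@type\": \"JobPosting\", \"title\": \"Engineer\"}")
```

Evaluate each form at the REPL and look at what comes back. With cl-json's default settings, the last form should produce an association list keyed by keywords such as :@TYPE. Try (fetch-page ...) on one of your test URLs separately, since it needs network access.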

Testing and submitting

There are no test cases, but your code should handle at least the following URLs:

Look for a few other types of sites, such as organization home pages. In all cases, you should see list structures with JSON-LD tags such as @TYPE. For readability, you may want to use PPRINT to pretty-print the list output.
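For instance, PPRINT can be applied to a nested result like the one below. The data is a hand-written sample in the shape a JSON decoder such as cl-json would typically return (association lists with keyword keys), not output from a real page.

```lisp
;; Hypothetical sample entity; values are made up for illustration.
(defparameter *sample-entity*
  '((:@context . "https://schema.org")
    (:@type . "JobPosting")
    (:title . "Software Engineer")
    (:hiring-organization
     (:@type . "Organization")
     (:name . "Example Corp"))))

;; Print the structure with line breaks and indentation.
(pprint *sample-entity*)

;; Individual fields can be pulled out with ASSOC.
(cdr (assoc :@type *sample-entity*))
```

The exact line breaks PPRINT chooses depend on the implementation and on *print-right-margin*, so your output may be laid out differently.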

Submit all the functions you defined to do this task. Include the URLs you tested them on. Note any issues you ran into, both those you solved and those you could not.