These exercises develop some useful utility functions for working with different aspects of the Semantic Web.

Exercises

SW-1: Name conversion

Function names: camelize, hyphenate

Test cases: microdata-exs-tests.lisp.

Semantic web bundle

A common problem in many languages is converting name strings between camelCase and hyphenated form. For example, JavaScript names are camelCase but CSS names are hyphenated, so the CSS property font-size has to be written as fontSize in JavaScript.

For us, this occurs when reading Semantic Web names, which use camelCase (e.g., JobPosting), when what we want is a Lisp symbol like job-posting.

Define (camelize string [capitalize]) to take a hyphenated string and return the camelCase equivalent. If the optional second argument is given and is true, then the first letter should be capitalized.
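One way to approach this is a single pass over the string that remembers whether the next letter should be upper case. The following is only a sketch under that assumption; the name sketch-camelize is hypothetical, and the choice to lowercase all other letters (so that "FONT-SIZE" also works) is a guess about the intended behavior, not something the test cases are known to require.

```lisp
(defun sketch-camelize (string &optional capitalize)
  "Return STRING with hyphens removed and the letter after each hyphen
upper-cased. If CAPITALIZE is true, the first letter is upper-cased too."
  (with-output-to-string (out)
    (loop with upcase-next = capitalize
          for char across string
          do (cond ((char= char #\-)
                    ;; Drop the hyphen and remember to upcase what follows.
                    (setf upcase-next t))
                   (upcase-next
                    (write-char (char-upcase char) out)
                    (setf upcase-next nil))
                   (t
                    ;; Assumption: every other letter becomes lower case.
                    (write-char (char-downcase char) out))))))
```

For example, (sketch-camelize "font-size") returns "fontSize", and (sketch-camelize "job-posting" t) returns "JobPosting".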

Define (hyphenate string [case]) to take a camelCase string and return the hyphenated equivalent. If the optional second argument is :upper (the default), then the result string should be all upper case. If it is :lower, it should be all lower case. Any other value is an error.

Only insert a hyphen when the case changes. Something like "getURL" should become "GET-URL" not "GET-U-R-L".
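The case-change rule can be read as "insert a hyphen between a lower-case letter and an upper-case letter that follows it." The sketch below assumes that reading; the name sketch-hyphenate is hypothetical. Under this rule an acronym followed by a word, as in "URLString", is not split, and the actual test cases may expect something different.

```lisp
(defun sketch-hyphenate (string &optional (letter-case :upper))
  "Insert a hyphen wherever a lower-case letter is followed by an
upper-case one, then convert the result according to LETTER-CASE."
  (let ((hyphenated
          (with-output-to-string (out)
            (loop for previous = nil then char
                  for char across string
                  do (when (and previous
                                (lower-case-p previous)
                                (upper-case-p char))
                       (write-char #\- out))
                     (write-char char out)))))
    ;; ECASE signals an error for anything other than :upper or :lower.
    (ecase letter-case
      (:upper (string-upcase hyphenated))
      (:lower (string-downcase hyphenated)))))
```

With this definition, (sketch-hyphenate "getURL") returns "GET-URL" and (sketch-hyphenate "JobPosting" :lower) returns "job-posting".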

SW-2: Microdata Reader

This exercise has been retired in favor of JSON-LD.

SW-3: JSON-LD reader

Function name: (read-json-ld url-string)

Submit this in the Code Critic under SWP 1: Semantic Web Personal 1

Your somewhat open-ended job in this exercise is to write a function that takes a URL, gets the JSON-LD stored on that page, if any, and returns the entities defined in the JSON-LD as a list of frames, using the frame format used in class:

(instance-id (abstractions*) attributes*)
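To make the target format concrete, here is a sketch that turns one already-decoded JSON-LD entity into a frame of that shape. Several things here are assumptions rather than course requirements: the entity is represented as an alist with string keys (your JSON library may give you keywords instead), the "@id" value serves as the instance id, the "@type" values serve as the abstractions, and each attribute is a two-element list. The name sketch-entity->frame is hypothetical.

```lisp
(defun sketch-entity->frame (entity)
  "ENTITY is assumed to be an alist with string keys, such as
((\"@id\" . \"#r1\") (\"@type\" . \"Recipe\") (\"name\" . \"Dressing\"))."
  (flet ((value-of (key)
           (cdr (assoc key entity :test #'string=))))
    (let ((id (value-of "@id"))
          (types (value-of "@type")))
      (list* id
             ;; "@type" may hold one type or a list of types.
             (if (listp types) types (list types))
             ;; Everything except the JSON-LD keywords becomes an attribute.
             (loop for (key . value) in entity
                   unless (member key '("@id" "@type" "@context")
                                  :test #'string=)
                     collect (list key value))))))
```

Nested entities (for example, a recipe's author) would need the same treatment applied recursively, which this sketch leaves out.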

There are no test cases, but your code should handle at least the fairly complex example at https://www.foodnetwork.com/recipes/food-network-kitchen/honey-mustard-dressing-recipe-2011614

What to read

The point of JSON-LD is that semantic data can be embedded in a web page in an easy to retrieve location, using the standardized schema.org ontology, in a relatively easy to parse JSON structure. Everything uses standard web technologies, supported by many programming languages.
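The "easy to retrieve location" is a script element whose type is application/ld+json. As a rough illustration of how little is needed to find it, the sketch below pulls the text of each such element out of a page using plain string search. This is a simplification for illustration only, with a hypothetical function name; the exercise expects you to use a real HTML parser, and string search can be fooled by pages that mention the type elsewhere.

```lisp
(defun sketch-extract-json-ld (html)
  "Return a list of the text found inside each application/ld+json
script element in the string HTML."
  (let ((marker "application/ld+json")
        (results '())
        (scan-from 0))
    (loop
      (let* ((type-start (search marker html :start2 scan-from))
             ;; The end of the opening <script ...> tag.
             (open-end (and type-start
                            (position #\> html :start type-start)))
             ;; The start of the matching </script> tag.
             (close-start (and open-end
                               (search "</script" html
                                       :start2 open-end
                                       :test #'char-equal))))
        (unless close-start
          (return (nreverse results)))
        (push (string-trim '(#\Space #\Tab #\Newline #\Return)
                           (subseq html (1+ open-end) close-start))
              results)
        (setf scan-from close-start)))))
```

Each string returned this way is JSON text that still has to be decoded before it can be turned into frames.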

Libraries

For this exercise, you will use several Common Lisp libraries that can be loaded with Quicklisp.

Lisp implementation-specific notes

LispWorks: You need to tell LispWorks to allow international characters in strings, since they occur often on web pages. You do that by executing:

(lw:set-default-character-element-type 'cl:character)

This needs to be re-done every time you start Lisp. If you want, you can put it in your init file.

Allegro Express / macOS: I'm currently getting an SSL initialization error with HTTPS URLs. This may be a local issue with my SSL installation, although SSL works in LispWorks.

To install these three libraries:

(ql:quickload "drakma")
(ql:quickload "cl-json")
#-allegro (ql:quickload "cl-html-parse")

The #-allegro tells the Lisp reader to skip the expression that follows when running in Allegro. The HTML library is already in Allegro and does not need to be installed.
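The matching #+allegro form does the opposite: the expression that follows is read only in Allegro. A small sketch of the pair, using a hypothetical function name, shows how the reader keeps exactly one of the two values depending on the implementation:

```lisp
(defun sketch-html-parser-source ()
  "Report where the HTML parser comes from in the running Lisp."
  ;; Only one of the next two expressions survives reading.
  #+allegro :built-in
  #-allegro :quicklisp)
```

In Allegro this function returns :built-in; in any other Lisp it returns :quicklisp.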

With these libraries installed,

Test all of these functions by hand before going any further to make sure they work properly. If not, post to Campuswire. Include what Lisp and operating system you are using, and the exact input and output.

Testing and submitting

Test your code on several recipes at Food Network. Look for recipes on other sites as well.

Test your code on a non-recipe site. According to current search engine statistics, over 40% of all websites include JSON-LD! How do you find them? One clue is when a web search returns a rich snippet. For example, if you search for music concerts and get a list of labeled events with critical details, then the source pages for those events probably have JSON-LD. Also see JSON-LD notable examples.

Submit all the functions you defined to do this task. Include the URLs you tested it on. Note issues and problems that you could and could not solve.