This is a static copy of the main wikispot.org site, preserved for historical purposes only. Please see this page for more information.

Semantic Wiki

What is a Semantic Wiki?

  1. What is a Semantic Wiki?
    1. Importance of the Semantic Wiki
    2. Requires no Extra Effort
    3. Use cases
    4. Unified Syntax Examples
      1. Wiki Page Links
      2. Literal Values and Offsite Links
  2. Sycamore Plugin
    1. Database Needs
      1. Tables
        1. curPages
        2. namespace
        3. type
        4. metadata
    2. Interface Needs
      1. Edit Interface
      2. Search Interface
    3. Code Organization
  3. Other Semantic Wiki Projects
  4. Semantic Web Links
  5. User interface
    1. Mockup
  6. Ultimate goals

The word [WWW]Semantic relates to the meaning carried by any communication: words, code, drawings, anything. The idea of a Semantic Wiki (or semantic web) is to write or create information in a way that makes it easy for software agents to process and use. Many people use the terms semantic data and metadata interchangeably; both mean information about information. In the case of a Semantic Wiki, it means giving richer meaning to data managed via a user-maintained website or wiki. Semantic data can be expressed as a set of exactly three pieces of data: a subject, a predicate, and an object. Together these three pieces are called a statement.
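
As a rough illustration of the three-piece idea, a statement can be modeled as a small tuple. This is only a sketch: the three pieces are named subject, predicate, and object here (the terms the metadata table section of this page uses), and the `Statement` class, the `describe` helper, and the sample values are made up for the example.

```python
from typing import NamedTuple


class Statement(NamedTuple):
    """One semantic statement: exactly three pieces of data."""
    subject: str    # the thing being described, e.g. a wiki page
    predicate: str  # the name of the property or relation
    obj: str        # the value, or the other thing being related to


# Hypothetical statements about one page; the values are illustrative only.
statements = [
    Statement("Sam's Pizza", "address", "300 Main Street"),
    Statement("Sam's Pizza", "category", "restaurant"),
]


def describe(stmts, subject):
    """Collect predicate -> object pairs for a single subject."""
    return {s.predicate: s.obj for s in stmts if s.subject == subject}
```

Calling `describe(statements, "Sam's Pizza")` gathers everything known about that page into one dictionary, which is the kind of lookup a software agent would do.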

A simple example of semantic data is HTML's 'bold' tag. A word, sentence, or paragraph can be put between an opening <b> and a closing </b>. In most cases this just means the text in the middle is drawn in bold when it's viewed. But some software will use the tag to mark the information as bold in some other way. For example, screen readers for the blind can change tone or inflection for <b> tags. In short, the tag itself is not drawn to the screen, but the technology in the middle can make that un-drawn information useful.

Another example of semantic data is the "address" of a place. We can make a wiki page for a place, like Sam's Pizza, but then we'd like to give it an address. We can simply write the address on the page, or we can use the [[address]] macro to identify the address of the page in a way that's more meaningful to Sycamore. Because the macro is used, the address of the page gets plotted on a map. The idea behind the semantic effort is to allow us to express other ideas, like "address" or "category", in a way that is understandable to computers while being painless for people.

Importance of the Semantic Wiki

When data has meaning to both a machine and a person, wonderful things can happen. What social networking sites like [WWW]MySpace have done to the dating scene and to personal relationships, the semantic web will do to all electronically stored information. You will be able to submit metadata queries like: show me all objects of type "restaurant" that are open after midnight and less than 5 miles from my home. And, seriously, don't you want a god damn hamburger at 2AM?
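
To make that kind of query concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the page records, the `closes_hour` field (hours counted past 24 for places that close after midnight), and the helper names. The distance uses the standard haversine formula.

```python
import math

EARTH_RADIUS_MILES = 3958.8


def miles_between(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


# Made-up pages and values, purely for illustration.
pages = [
    {"name": "Late Burger", "type": "restaurant", "closes_hour": 27, "lat": 38.58, "lon": -121.49},
    {"name": "Early Cafe", "type": "restaurant", "closes_hour": 21, "lat": 38.58, "lon": -121.49},
    {"name": "Far Diner", "type": "restaurant", "closes_hour": 26, "lat": 39.58, "lon": -121.49},
    {"name": "Night Library", "type": "library", "closes_hour": 26, "lat": 38.58, "lon": -121.49},
]


def open_late_nearby(pages, home_lat, home_lon, max_miles=5.0):
    """Names of restaurants that close after midnight within max_miles of home."""
    return [
        p["name"]
        for p in pages
        if p["type"] == "restaurant"
        and p["closes_hour"] > 24
        and miles_between(home_lat, home_lon, p["lat"], p["lon"]) < max_miles
    ]
```

With a home at latitude 38.57 and longitude -121.48, only "Late Burger" passes all three conditions: the right type, open past midnight, and close enough.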

Requires no Extra Effort

Wikis are user-maintained. This means that users input data and maintain the relationships between other data and pages on the site. The best place to give meaning to data is at the source. When a user is adding a phone number for a restaurant they give it a name (usually "phone number") and a value. So they are already providing semantic data about that information; it's just a matter of making it easy to store. The "subject" of the statement can be extracted from the current page.

Use cases

When talking about this, it's good to have an idea of what we'd like to be able to use this for. Here are some potential uses that wiki users would like (and, in some cases, have devised ways around):

Integrated with mapping, à la a map for [davis]Apartments: a map of all pages with a given tag / value.

Unified Syntax Examples

See /API for one possible implementation.

Possible formats for either a relation or attribute (the unified syntax):

Wiki Page Links

The following examples link a wiki page to another wiki page and give that link a specific meaning.

Literal Values and Offsite Links

The following examples link a wiki page to an off-site URL or describe something about the current page.

With this form, we can explain the idea to newcomers as a way of "tagging the page with a value," in a sense. Since each predicate has a pre-defined type (managed via a special page) there is no need to explicitly state the type.

Another motivation behind this format is that on flickr people began tagging images with non-tag-like data, such as geocoordinates, using a technique known as machine tagging. People started using a very similar format (geo:long=123.456), and flickr ended up supporting it and allowing queries based upon it. Check out [WWW]this discussion for more. (In our case, we'd have geo long := 123.456)
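
The two notations mentioned above are simple enough to parse with a few lines of Python. This is a sketch under assumptions: the function names are invented, the machine-tag pattern only covers the plain namespace:predicate=value shape, and the ":=" form is taken to be "name := value" as in the geo long example.

```python
import re

# namespace:predicate=value, as in the flickr-style example geo:long=123.456
MACHINE_TAG = re.compile(r"^(?P<namespace>\w+):(?P<predicate>\w+)=(?P<value>.+)$")


def parse_machine_tag(text):
    """Split 'namespace:predicate=value' into its three parts, or return None."""
    m = MACHINE_TAG.match(text.strip())
    return (m["namespace"], m["predicate"], m["value"]) if m else None


def parse_wiki_statement(text):
    """Split 'name := value' into (name, value), or return None if it does not fit."""
    name, sep, value = text.partition(":=")
    if not sep or not name.strip() or not value.strip():
        return None
    return name.strip(), value.strip()
```

For instance, `parse_machine_tag("geo:long=123.456")` yields the three parts `geo`, `long`, and `123.456`, while `parse_wiki_statement("geo long := 123.456")` yields the name `geo long` and the value `123.456`. The page name would supply the subject in both cases.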

Sycamore Plugin

There is interest in a Sycamore Plugin that will bring Semantics to the local wiki world. This development effort is currently in the planning phase.

Database Needs

[semantic_db.png] Copy source XML into [WWW]this editor to view the schema details.

The [WWW]Semantic MediaWiki plugin is relatively complete and has been used as a reference for analyzing database requirements. We might consider changing from their standard a bit. I propose using the following database structure.

Source ERD file: metadata.xml
A [WWW]demo of the ERD Editor is available also.

Tables

curPages

This table is already a part of the Sycamore schema. It is shown here for illustrative purposes. Here are some sample rows:

+--------------------+
| pagename           | ...
+--------------------+
| East Sacramento    | ...
+--------------------+
| Sacramento         | ...
+--------------------+
| California         | ...
+--------------------+

namespace

This table provides a way to store multiple namespaces (or vocabularies) in a single wiki. The most used entries in this table are class, literal, and wiki. It is a good idea to try to use other namespaces like foaf, dc, etc., but it should be up to the community to enforce these rules. These namespaces will likely be imported on initial installation (or upgrade). There doesn't necessarily need to be a way to add namespaces (yet).

+-----------+---------------------------------------------+
| alias     | uri                                         |
+-----------+---------------------------------------------+
| class     | http://wikispot.org/class/                  |
+-----------+---------------------------------------------+
| literal   | http://wikispot.org/literal#                |
+-----------+---------------------------------------------+
| wiki      | http://sacramento.wikispot.org/             |
+-----------+---------------------------------------------+
| dc        | http://purl.org/dc/elements/1.1/            |
+-----------+---------------------------------------------+
| dcterms   | http://purl.org/dc/terms/                   |
+-----------+---------------------------------------------+
| wikipedia | http://en.wikipedia.org/wiki/               |
+-----------+---------------------------------------------+
| rdf       | http://www.w3.org/1999/02/22-rdf-syntax-ns# |
+-----------+---------------------------------------------+
| rdfs      | http://www.w3.org/2000/01/rdf-schema#       |
+-----------+---------------------------------------------+
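
One way to read the namespace table is as a lookup from alias to URI prefix. The sketch below assumes a qualified name is written as alias:name (the form class:phone uses on this page); the `expand` function and the choice of a plain dictionary are illustrative, not part of Sycamore.

```python
# A few rows copied from the namespace table above.
NAMESPACES = {
    "class": "http://wikispot.org/class/",
    "literal": "http://wikispot.org/literal#",
    "wiki": "http://sacramento.wikispot.org/",
    "dc": "http://purl.org/dc/elements/1.1/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}


def expand(qualified_name):
    """Turn 'alias:name' into a full URI using the namespace table."""
    alias, sep, local = qualified_name.partition(":")
    if not sep:
        raise ValueError(f"expected alias:name, got {qualified_name!r}")
    if alias not in NAMESPACES:
        raise KeyError(f"unknown namespace alias {alias!r}")
    return NAMESPACES[alias] + local
```

So `expand("class:phone")` gives `http://wikispot.org/class/phone`. Page names containing spaces would need URL escaping first, which this sketch leaves out.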

type

This table associates a particular data type with a "name" or "predicate". An example predicate is a phone number. In the example below the predicate class:phone is of type integer. Phone numbers can be stored as just numbers (e.g. 9161234567) and parsed into a readable format (e.g. (916) 123-4567) as needed. The following types are expected to be supported:

+-------+-----------------+----------+
| alias | name            | type     |
+-------+-----------------+----------+
| class | phone           | integer  |
+-------+-----------------+----------+
| class | neighborhood_of | wikipage |
+-------+-----------------+----------+
| class | city_in         | wikipage |
+-------+-----------------+----------+
| class | tag             | string   |
+-------+-----------------+----------+
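
The phone example can be sketched as a small formatter that turns the stored integer into the readable form. The function name and the restriction to exactly ten digits are assumptions made for this illustration.

```python
def format_phone(number):
    """Format a ten-digit phone number stored as an integer, e.g. 9161234567."""
    digits = str(number)
    if len(digits) != 10 or not digits.isdigit():
        raise ValueError(f"expected a ten-digit number, got {number!r}")
    # Area code, exchange, line number.
    return f"({digits[:3]}) {digits[3:6]}-{digits[6:]}"
```

`format_phone(9161234567)` returns `(916) 123-4567`. One caveat of the integer type is that leading zeros are lost, which would matter for numbering plans where numbers can start with 0.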

metadata

This table defines metadata for pages stored in the wiki. Each row follows the subject-predicate-object rule, with the pagename always acting as the subject.

+----+-----------------+-----------------+-----------------+----------------+--------------+
| id | pagename        | predicate_alias | predicate       | object_alias   | object       |
+----+-----------------+-----------------+-----------------+----------------+--------------+
| 1  | sacramento      | class           | city_in         | wiki           | California   |
+----+-----------------+-----------------+-----------------+----------------+--------------+
| 2  | east sacramento | class           | neighborhood_of | wiki           | Sacramento   |
+----+-----------------+-----------------+-----------------+----------------+--------------+
| 3  | east sacramento | class           | tag             | literal        | neighborhood |
+----+-----------------+-----------------+-----------------+----------------+--------------+
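
To see how the type and metadata tables might work together, here is a sketch using Python's built-in sqlite3 module with an in-memory database. The column names and sample rows come from the tables above; the SQL column types, the primary keys, and the two helper functions are assumptions, and a real Sycamore install would use its own database layer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE type (
    alias TEXT NOT NULL,
    name  TEXT NOT NULL,
    type  TEXT NOT NULL,
    PRIMARY KEY (alias, name)
);
CREATE TABLE metadata (
    id              INTEGER PRIMARY KEY,
    pagename        TEXT NOT NULL,
    predicate_alias TEXT NOT NULL,
    predicate       TEXT NOT NULL,
    object_alias    TEXT NOT NULL,
    object          TEXT NOT NULL
);
""")
conn.executemany("INSERT INTO type VALUES (?, ?, ?)", [
    ("class", "phone", "integer"),
    ("class", "neighborhood_of", "wikipage"),
    ("class", "city_in", "wikipage"),
    ("class", "tag", "string"),
])
conn.executemany("INSERT INTO metadata VALUES (?, ?, ?, ?, ?, ?)", [
    (1, "sacramento", "class", "city_in", "wiki", "California"),
    (2, "east sacramento", "class", "neighborhood_of", "wiki", "Sacramento"),
    (3, "east sacramento", "class", "tag", "literal", "neighborhood"),
])


def pages_with(predicate, obj):
    """Page names that carry the given predicate/object pair."""
    rows = conn.execute(
        "SELECT pagename FROM metadata WHERE predicate = ? AND object = ? "
        "ORDER BY pagename",
        (predicate, obj),
    )
    return [row[0] for row in rows]


def typed_statements(pagename):
    """(predicate, object, type) for each statement about a page."""
    rows = conn.execute(
        "SELECT m.predicate, m.object, t.type FROM metadata m "
        "JOIN type t ON t.alias = m.predicate_alias AND t.name = m.predicate "
        "WHERE m.pagename = ? ORDER BY m.id",
        (pagename,),
    )
    return rows.fetchall()
```

`pages_with("neighborhood_of", "Sacramento")` finds the "east sacramento" page, and `typed_statements("east sacramento")` shows how the join against the type table tells the caller how each object should be interpreted.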

Interface Needs

This feature requires some changes to the user interface.

Edit Interface

The edit interface might need to be modified for certain data types like location, areas, etc. This could just be css/dom javascript magic to insert the content into the edit textbox.

Can an example be given of an interface for editing that allows these relationships to be expressed?

Search Interface

There needs to be an advanced search option that allows results to be returned based on relations and attributes. The search's options could be based on the items searched for — for instance, a "within x miles of.." operator when you're interested in searching based on "location". We could tailor the interface to allow you to select a number of <types> to base your search on, and each type has a different way of allowing you to query it.

The types could be pluggable, so each type could have two python files associated with it. One that would tell us how we want to query the database for the information we want, and the other telling us how to represent the information to the user once we've gotten the search results (as well as how to represent the information when it's on a page — e.g. an address should be linked to the map associated with the address point).
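
The pluggable-type idea could look roughly like the following. For brevity the sketch keeps both hooks as two callables in one registry rather than two separate Python files per type; `TypePlugin`, `TYPE_PLUGINS`, `search_clause`, and the SQL fragments are all hypothetical names and shapes, not existing Sycamore code.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class TypePlugin:
    """Hypothetical plugin for one data type."""
    # Turns (predicate, raw user input) into a SQL WHERE fragment plus parameters.
    build_query: Callable[[str, str], Tuple[str, tuple]]
    # Turns a stored value into the text shown in results or on a page.
    render: Callable[[str], str]


def _string_query(predicate, user_input):
    return "predicate = ? AND object = ?", (predicate, user_input)


def _integer_query(predicate, user_input):
    # Compare numerically so that '0042' and '42' match the same stored value.
    return "predicate = ? AND CAST(object AS INTEGER) = ?", (predicate, int(user_input))


TYPE_PLUGINS: Dict[str, TypePlugin] = {
    "string": TypePlugin(build_query=_string_query, render=str),
    "integer": TypePlugin(build_query=_integer_query, render=lambda v: str(int(v))),
}


def search_clause(type_name, predicate, user_input):
    """Look up the plugin for a type and let it build the search clause."""
    return TYPE_PLUGINS[type_name].build_query(predicate, user_input)
```

A "location" type would plug in the same way, with a query hook that understands a "within x miles of" operator and a render hook that links the address to its map.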

Code Organization

It should be possible to implement this feature as a plugin, but it does require database changes and extra search capabilities. A few custom pages will be needed, including one for adding entries to the type table.

See /Caching for thoughts about making this fast.

Other Semantic Wiki Projects

There are other projects that are relatively far along in the development process. One to watch is Semantic MediaWiki, which already has some [WWW]early adopters.

Semantic Web Links

User interface

The UI for all of this is really important. We need to keep it just as easy as it is now for people to change addresses, phone numbers, and so forth.

[metadata_button.png] One option would be to have a button that reads 'Metadata' in the edit area.

[metadata_edit.png] One option for editing metadata.

Or maybe we should show all the metadata stuff in the normal editor interface, keeping everything in the same place. We could also make it so that wherever we display the data we allow it to be clicked on and edited, inline-style.

There are a few different ways of dealing with the display of the metadata. One is to embed the data directly into the page content, in the same way things like links and macros are embedded in a page's text. All data entered would be displayed right where it was entered.

Another way is to have data entered into the page's body (or via a separate interface), but then only displayed when it is signaled by some sort of [[get(value)]] macro. This has the advantage of ultimate control over the presentation of the information. It has the drawback of making it potentially confusing to change the data: how does the average joe know how to change the address of the page? What used to be just "300 Main Street" is now [[get(address)]]. This confusion could probably be mitigated by displaying the metadata fields right in the edit interface and, when initiating a [wikispot]quick edit on the area, somehow entering into an edit for just that metadata field.
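
A [[get(value)]] macro of this kind could be expanded with a single regular-expression pass over the page text. This is only a sketch: the function name, the dictionary of metadata, and the choice to leave unknown names untouched are assumptions.

```python
import re

# Matches [[get(name)]] and captures the name between the parentheses.
GET_MACRO = re.compile(r"\[\[get\((?P<name>[^)]+)\)\]\]")


def expand_get_macros(text, metadata):
    """Replace each [[get(name)]] with the page's stored value for that name."""
    def replace(match):
        name = match["name"].strip()
        # Leave the macro in place if the page has no value under that name.
        return metadata.get(name, match.group(0))

    return GET_MACRO.sub(replace, text)
```

Given the metadata `{"address": "300 Main Street"}`, the text `Visit us at [[get(address)]].` renders as `Visit us at 300 Main Street.`, while a macro naming a missing field stays visible so an editor can spot it.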

Another way is to have the data entered and displayed in a way that's separated from the page's body (e.g. at the bottom of the page). This has the advantage of making it easy to see where to change the information (you change it right where it's displayed), but it has the disadvantage of not allowing for careful control over where the data is displayed, and of being redundant (the address will still probably be entered on the page). We could still allow for something like [[get(value)]] in this case.

Mockup

UI mockup goes here

Ultimate goals

Consistent with the goals of the Sycamore project, the aim of the semantic feature will not be to produce a proof of concept system for semantic research, but rather a solid, easy-to-use semantic system that will help us access information more easily.

Questions

Can you explain the purpose of the semantic_datatype table? It seems like it's supposed to be some sort of dispatch. The regexp matches and then that tells us what the type of the statement is, and we use that how in this model? (Basically, why doesn't the semantic_datatype associate itself with the predicate?)


2007-03-20 13:12:33   semantic_datatype: I've added some notes about my thinking. —Sc0ttBeardsley


2007-03-20 17:56:15   Why is there no relationship between the semantic_attribute table and the semantic_relation table? After we create a datatype and place it into the semantic_attribute table, doesn't that datatype get associated with the predicate of the semantic_relation table? I can define a phone number's form and say that it is a phone number, but then when I encounter another phone number I'd like to use the same row from semantic_attribute, as the predicate is the same. (Though, I notice the semantic_attribute is tied to a specific page, too. I suppose I need more clarification as to its purpose. I know it's for identification of types, but I'm not sure how it's fitting in.) —75.31.44.27


2007-03-20 20:00:11   The semantic_attribute table is said to "[define] attributes for objects stored in the wiki", and has subject, predicate, and object. But so does the semantic_relation table? What's the purpose of that table? —75.31.44.27


2007-03-20 21:05:08   I think the attribute table needs to be changed. The predicate should be able to be one of the standard predicates in DC, FOAF, etc. I'd also like it to be able to be a custom datatype. The semantic_relation table is for showing relations between two real world things. One of those things is represented as a page in the wiki; the other can be another page in the wiki or some other real world thing. For example: "Shakey's Pizza" "Is A" "Restaurant". Here "Shakey's Pizza" has the local wiki's namespace, so it represents the page in the wiki. —Sc0ttBeardsley


2007-03-20 21:14:27   So what you had in mind is that the relationship is always between a page and another object? —75.31.44.27


2007-03-20 21:18:20   Yes, generally a page and another object... the page can either be referenced in the subject or the object... The database structure would allow storing relationships between two non-page objects but that's not all that interesting for our purposes. —Sc0ttBeardsley


2007-03-20 23:32:38   To recap relation vs attribute: relations connect two objects and attributes connect an object to a literal (like a date/number/string). There is a good discussion about this on [WWW]MediaWiki's blueprint page. —Sc0ttBeardsley


2007-04-20 03:06:49   I like the simpler schema. We should talk a bit about markup and UI. [Established:=Date("1990-01-01")] versus Date Established := 1990-01-01. If we say a type can only be one word then the latter markup would work well, I think?

We could also opt to have this as a somewhat disjoint UI from the normal editing interface (a "Metadata" button?). I'm really not sure how the UI for all of this ought to work, but I think it's actually the most important part, as we have to make this easy to use (otherwise it won't be used). —PhilipNeustrom


2007-04-20 03:46:40   re: metadata button: I was thinking of just embedding it into the text of a page. The metadata wouldn't have to be displayed by default. Say you have a macro called Metadata that takes 4 args (predicate_alias,predicate,value,display_flag). This macro would add an entry to the metadata table, then if the display_flag is set it would display the metadata in a predefined format (based on the type of the predicate). For example I have a restaurant page with the following macro call: Metadata(class,tag,expensive). This would add the metadata about this restaurant being expensive but it would not display that information on the final page. As far as getting a list of valid predicate_aliases and predicates, yes it might be nice to have some sort of tool (ajax?) that will pull up the already defined vocabularies. It is important to have something like this because we want everyone speaking the same language. —Sc0ttBeardsley
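
A minimal sketch of the macro described in the comment above, assuming the page name comes from the rendering context and that the metadata table is stood in for by a plain Python list. The function name, the `store` argument, and the "predicate: value" display format are all made up for illustration.

```python
def metadata_macro(store, pagename, predicate_alias, predicate, value, display_flag=False):
    """Record one metadata entry for the page and optionally return display text."""
    # The list stands in for the metadata table; the page is the subject.
    store.append((pagename, predicate_alias, predicate, value))
    # Nothing is shown on the page unless the display flag is set.
    return f"{predicate}: {value}" if display_flag else ""
```

The call Metadata(class,tag,expensive) would then correspond to `metadata_macro(store, page, "class", "tag", "expensive")`, which records the entry and renders nothing.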


2007-04-20 04:17:34   A comment about types: I think we should offload the type of a metadata item onto another special page. Instead of having the page editor define the type of a metadata item inline it should be a separate procedure. This makes it slightly more difficult (not impossible) to add new vocabulary words. The goal is to get people to use a small set of words to describe data. If we tie type to the predicate (aka keyword aka name) elsewhere then we can limit the syntax required for the page editors while still knowing what type of data they should be entering. So the special page will allow page editors to add a new vocabulary word (and its type) on the fly. This will essentially be an interface to the type table. —Sc0ttBeardsley


2007-04-21 01:11:51   Ahh yes, I see where you're going Philip. I kinda like the separate metadata interface. I think there would have to be a drop down menu for the "tag" field as it is labeled in the UI screenshot. We should talk more about this though. —Sc0ttBeardsley


2007-04-21 12:23:12   You could have the "metadata" entered directly on the page in some sort of metadata block. That block could have a bunch of display options including "hide". This way the user would be able to use [[get(address)]] if they needed to and not be redundant. In the Confluence wiki system the metadata "block" is simply a macro with a body:

[[metadata(hide)]] 
name = Joe
phone = (555)555-5555
address = 123 Main Street
[[metadata]]
—StephenDay
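
A block in that shape is straightforward to parse. The sketch below assumes one "name = value" pair per line between an opening [[metadata(options)]] and a closing [[metadata]], with "hide" as the only recognized option; the function name and return structure are invented for the example.

```python
import re

# Opening macro with options, a body, then the bare closing macro.
BLOCK = re.compile(
    r"\[\[metadata\((?P<options>[^)]*)\)\]\](?P<body>.*?)\[\[metadata\]\]",
    re.DOTALL,
)

sample = """[[metadata(hide)]]
name = Joe
phone = (555)555-5555
address = 123 Main Street
[[metadata]]"""


def parse_metadata_block(text):
    """Return the hide flag and the name/value pairs of the first metadata block."""
    m = BLOCK.search(text)
    if m is None:
        return None
    fields = {}
    for line in m["body"].splitlines():
        name, sep, value = line.partition("=")
        if sep and name.strip():
            fields[name.strip()] = value.strip()
    return {"hidden": m["options"].strip() == "hide", "fields": fields}
```

Running `parse_metadata_block(sample)` gives a hidden block with the three fields name, phone, and address, which could then feed [[get(address)]] elsewhere on the page.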


2007-04-21 23:27:53   Ya, that should work, but I'm worried about how we'll be adding a new tag/name. I guess it'll be OK to just allow new tags and set their type to the default (a string). Since every tag/name has a specific type, you'll want a way to define this when adding the tag/name. Also, just a note on why every tag/name should have a specific type: this will enable a uniform display of that name. So phone numbers will always be displayed as: +1 (XXX) XXX-XXXX —Sc0ttBeardsley


More ideas about using a macro with a body (or some kind of in-page metadata block):

[[metadata(display=False)]]
|| name || Joe || firstName ||
|| phone || (555)555-5555 || phoneNumber ||
|| address || 123 Main Street || streetAdress ||
|| number of pets || 14 || int ||
[[metadata]]

Most users will already know how to make tables. If they don't put in the last column, just use a default, which could be derived from the name in many cases. —StephenDay


2007-04-23 03:06:35   Given the schema here, how would we search for something based on location without getting all of the locations in the database and scanning through them? Will string sorts/indexes be sufficient for our efficiency purposes, given that we keep all values in a consistent format in the DB? In what cases would we run into problems with this approach? This is just an example problem. —PhilipNeustrom


2007-04-23 03:27:14   You bring up a good point, Philip. It would be best to store spatial data using spatial extensions to MySQL/Postgres. Do you have any suggestions? Perhaps a separate table for each type? —Sc0ttBeardsley
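
As one possible answer to the scanning question, short of the spatial extensions mentioned above, coordinates can be kept as numeric columns in a separate table and pre-filtered with an indexed bounding box. This is a swapped-in technique for illustration, not something this page proposes: the `location` table, its columns, the sample rows, and the 69-miles-per-degree approximation are all assumptions, and the box misbehaves near the poles and the 180° meridian.

```python
import math
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE location (pagename TEXT PRIMARY KEY, lat REAL NOT NULL, lon REAL NOT NULL)"
)
# The index lets the database narrow by latitude without reading every row.
conn.execute("CREATE INDEX location_lat_lon ON location (lat, lon)")
conn.executemany("INSERT INTO location VALUES (?, ?, ?)", [
    ("Near Cafe", 38.58, -121.49),   # made-up rows
    ("Far Diner", 39.58, -121.49),
])

MILES_PER_DEGREE_LAT = 69.0  # rough average


def candidates_within(lat, lon, miles):
    """Pages inside a square box around the point; a cheap pre-filter, not an exact radius."""
    dlat = miles / MILES_PER_DEGREE_LAT
    dlon = miles / (MILES_PER_DEGREE_LAT * math.cos(math.radians(lat)))
    rows = conn.execute(
        "SELECT pagename FROM location "
        "WHERE lat BETWEEN ? AND ? AND lon BETWEEN ? AND ? ORDER BY pagename",
        (lat - dlat, lat + dlat, lon - dlon, lon + dlon),
    )
    return [row[0] for row in rows]
```

The few candidates that survive the box can then be checked with an exact distance calculation. Storing the values as strings in the generic object column would not allow this kind of range query, which is one argument for per-type tables.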


2010-02-15 23:39:53   Notes on derived metadata sets...

Instead of single fields start with a base metadata form that consists of things that will need to be tracked about all pages:

Then from this create some metadata that uses this base, but adds groups of appropriate fields. At the first tier it is just separating pages out into general classes, but eventually you'd have (base -> business -> restaurant -> sushi) with each level adding appropriate new fields.

Fields should allow radio buttons, checkboxes, text, numeric, regex validated, range (for prices), date time, time matrixes (for open hours), photos. Each field should track whether it is required, and allow for a default value.

In the same way that people can now create templates, allow for the creation of metadata sets. Only admins will be able to remove fields from a set, because doing so would remove that data from all pages that use that metadata set. Alternatively, have a central library of metadata that any wiki can subscribe to.

It becomes very easy to produce reports off the information. (base -> business -> contractor -> painting) for instance would allow any page using that set to see at a glance all of the contractors, their license numbers, and their price ranges and ratings automatically. It should report for the local wiki first and then report separately for nearby wikis with an option to view the same information across all wikis. —JasonAller
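
The derived-set idea in the comment above maps naturally onto a chain of sets where each level adds fields to its parent. A sketch, with every class name and field name invented for illustration (the comment does not list the actual base fields):

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Field:
    name: str
    kind: str = "text"            # e.g. text, numeric, range, checkbox
    required: bool = False
    default: Optional[str] = None


@dataclass
class MetadataSet:
    name: str
    fields: List[Field] = field(default_factory=list)
    parent: Optional["MetadataSet"] = None

    def all_fields(self):
        """Fields inherited from every ancestor, followed by this set's own."""
        inherited = self.parent.all_fields() if self.parent else []
        return inherited + self.fields

    def path(self):
        """The chain of sets from the base down to this one."""
        prefix = self.parent.path() + " -> " if self.parent else ""
        return prefix + self.name


# Hypothetical field names at each level.
base = MetadataSet("base", [Field("address")])
business = MetadataSet("business", [Field("phone")], parent=base)
contractor = MetadataSet("contractor", [Field("license number", required=True)], parent=business)
painting = MetadataSet("painting", [Field("price range", kind="range")], parent=contractor)
```

Here `painting.path()` reads `base -> business -> contractor -> painting`, and `painting.all_fields()` returns the four fields a report over painting contractors could use as columns.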

This is a Wiki Spot wiki. Wiki Spot is a 501(c)3 non-profit organization that helps communities collaborate via wikis.