Frequently Asked Questions about OpenCyc
Version 0.7b
E-Mail Comments to: opencyc-doc@cyc.com
Most of the questions on this page are answered in the documents found on the OpenCyc documentation page, so you may want to check there before trying to make sense of the answers provided here.
These questions were taken from the OpenCyc IRC logs and posts on the SourceForge OpenCyc forums. Click on a question to see its answer.
Features of OpenCyc
Getting Started with OpenCyc
Definitions
OpenCyc Application Development
The OpenCyc Community
OpenCyc Licensing and Project Goals
Links to External Reading
What is OpenCyc?
OpenCyc is the open source version of the Cyc technology,
the world's largest and most complete general knowledge base and commonsense reasoning engine.
OpenCyc contains the full set of (non-proprietary) Cyc terms as well as millions of assertions
about them. Cycorp offers this ontology at no cost and encourages you to make use of it as you see fit.
What is included with the first release of OpenCyc?
Release 1.0 of OpenCyc includes the following:
-
The entire Cyc ontology containing hundreds of thousands of terms,
along with millions of assertions relating
the terms to each other, forming an upper ontology whose domain is all of human
consensus reality.
-
English strings corresponding to all concept terms, to assist with
search and display.
-
A compiled version of the Cyc Inference Engine and the Cyc Knowledge Base Browser.
-
Documentation and self-paced learning materials to help users
achieve a basic- to intermediate-level
understanding of the issues of knowledge
representation and application development using
Cyc.
-
A specification of CycL, the language in which Cyc (and hence OpenCyc) is written. There are CycL-to-Lisp, CycL-to-C, etc. translators.
-
A specification of the Cyc API, by calling which a programmer can build an OpenCyc application with very little familiarity with CycL or with the OpenCyc KB.
-
Links between Cyc concepts and WordNet synsets.
What open source extra programs will be included with Release 1.0?
OpenCyc Release 1.0 includes several open source programs along with the knowledge base and the knowledge server. These will tentatively include:
-
An ontology exporter to selectively export OWL files
-
Semantic Web Server supporting DAML queries (Java)
-
Inference graphing program (Java)
-
Java version of the Cyc API (Java)
Will the inference engine be open source? Why (not)?
Not 100% at this time. Cycorp intends to sell premium products and services using its inference engine, which has been designed to work with the Cyc Knowledge Base in an optimal fashion. Cycorp provides a reference Cyc Server executable for Intel-based Linux and for Windows 2000 along with the release. This includes a basic implementation of its inference engine, without source code, but with an irrevocably free license. You are free to use this engine in free or commercial applications with a guarantee of no license fee ever.
What can one use OpenCyc to do?
OpenCyc can be used as the basis for a wide variety of intelligent applications such as
-
speech understanding (using the KB to prune implausible choices via common sense, discourse context, and prosodics)
-
database integration (using the KB as an interlingua through which semantic joins occur automatically via back chaining) and consistency-checking
-
rapid development of an ontology in a vertical area (by extending and growing the OpenCyc KB in that area, using the OpenCyc Rapid Theory Formation toolkit)
-
email prioritizing, routing, summarizing, and annotating
to name just a few. If you have ideas or suggestions, you can send them to us at opencyc-doc@cyc.com or discuss them on the OpenCyc discussion forum on SourceForge.
Will people be able to develop their own browsers or other tools to access the knowledge base?
The provided Cyc Server implementation has a socket-based API, which developers can use to create applications that access the knowledge base for browsing, editing, and inference purposes.
How do I find the OpenCyc IRC Channel?
Download a free IRC chat program (e.g. mIRC), go to irc.freenode.net, and
type the command '/join #opencyc'.
Where are the OpenCyc IRC Logs?
http://tunes.org/~nef/logs/opencyc/
Is there a public server where I can browse the KB before I download it?
Yes. The list of available servers is here. The homeLinux sites are running Cyc servers for people to look at.
The KBs that are available here are just scratch versions, without assert privileges. You probably wouldn't typically put this interface directly on the web. You could consider it like an administrator interface. You would put something else in front (your app, whatever that happens to be).
What are the hardware requirements for running OpenCyc?
A Pentium, Athlon or equivalent running Windows 2000, Windows NT, or Linux, with at least 512 megabytes of RAM. Performance of the knowledge base is proportional to the speed of the processor.
What is OpenCyc.org?
OpenCyc.org is the organization that will administer the ongoing releases of OpenCyc.
While the first several releases of OpenCyc will consist entirely of contributions from Cycorp, Inc.,
OpenCyc.org will ultimately certify and incorporate the contributions of many other open source
contributors as well.
What does Cycorp mean by saying that the knowledge base will be "open source"? Will it be publicly available? Will it be free?
Yes, OpenCyc may be freely copied, distributed and used for commercial or non-commercial purposes according to the terms of the OpenCyc license. OpenCyc is currently released under the Apache License, Version 2. (More information on OpenCyc licensing can be found here.) "Source code" in this license refers to the CycL assertions in the OpenCyc Knowledge Base. Qualified parties can obtain a free license to a substantially larger subset of the Cyc Knowledge Base known as ResearchCyc which is for R&D use only. The complete Cyc Knowledge Base can be licensed from Cycorp, Inc. for commercial use. Terms for licensing the complete Cyc KB are negotiated on an individual basis. Year by year, each assertion in the latest version of Cyc will migrate to a subsequent release of ResearchCyc, and each assertion in ResearchCyc will migrate to a later release of OpenCyc.
Why is Cycorp giving away technology, and how does Cycorp benefit?
To establish Cyc as the standard for knowledge representation, for knowledge management,
for database integration, and in general for intelligent software applications. Also,
the release of OpenCyc will help lay the groundwork for the massively parallel effort to rapidly
grow the Cyc KB.
Are there other benefits to releasing OpenCyc?
OpenCyc will raise awareness of symbolic knowledge representation.
It will also create opportunities for combining symbolic and rule-based systems with
other technologies, such as neural networks, planning systems, machine learning and genetic algorithms.
Are there any negative outcomes you want to guard against?
There are two things we'd like to insure against that at first might seem to be at odds with each other.
First, we want to keep the KB from diverging, which occurs when changes are made in different
directions resulting in multiple different versions of the KB that are all actively in use.
One way to guard against divergence is to require that all core KB content remain open, because
everyone gets to see and adopt all changes to the core.
The other concern is that the same forced openness that helped prevent divergence
might cause commercial developers to reject the Cyc platform. After all, they may want
to sell a proprietary derivative product, so they don't want to be forced to expose all
of the valuable content they assembled.
How will knowledge be transferred on the OpenCyc-grid?
OpenCyc java source code includes agent adapters for both the CoABS and FIPA-OS
agent communities. CoABS is still proprietary (Global InfoTek from the DARPA CoABS project),
but FIPA-OS is open source and is one of a number of FIPA compliant agent systems. So the message
for CoABS and FIPA-OS is similar, and OpenCyc uses the FIPA ACL (Agent Communication Language) form --
serialized into XML (actually CycML) between agents. But we are looking at peer to peer as the
infrastructure for distributed OpenCyc, in that the agent communities have centralized directory
services, and peer to peer builds these on the fly. If the message content and performatives can
be made identical between agent community and p2p frameworks, then OpenCyc can bridge them when
appropriate. Think of the grid just as you would think of any of the networking implementations.
After the Knowledge Grid is implemented, when you launch OpenCyc or talk to an OpenCyc server,
you will be talking to all connected OpenCycs -- a semantic web.
What is an ontology?
There's a good definition at the Free On-Line Dictionary of Computing:
http://foldoc.doc.ic.ac.uk/foldoc/foldoc.cgi?query=ontology
An excerpt:
"An explicit formal specification of how to represent the objects, concepts and other entities that are assumed to exist in some area of interest and the relationships that hold among them."
One thing you're providing is a knowledge browser. Is this anything like a Web browser?
The provided Cyc Server implementation supports browsing and editing services via HTML.
Thus, any Web browser, such as Internet Explorer, Netscape, or Opera can be used for this purpose.
While a generic Web browser is used for browsing Web pages, the OpenCyc KB Browser is used for
browsing the contents of the OpenCyc Knowledge Base.
What is CycML?
The intended usage (CycML import/export is not yet available) is to have a sort of
knowledge exchange server where people submit CycML files and everyone can view and
download the files. The same files are great for both viewing and loading CycML.
XSL can be used to transform CycML into various HTML-based views and OpenCyc load utilities
can be used to load the CycML files into your copy of OpenCyc. You could even download packages
that include some CycML with SKSI assertions and a database or hash file and get them connected
up just by loading the CycML and dropping the database/file in the right directory. CycML will be
useful in both Java and SubL.
Is OpenCyc already compatible with Apache?
Yes, it is compatible with Apache but also has its own built-in HTTP server (this is the default)
-- it can use either.
Which kernel versions, browser versions, etc. has OpenCyc been tested with?
We develop on Red Hat 7. The following have been tested and appear to work:
-
Linux: Red Hat 7, Mandrake 8
-
Windows: NT, 2000
-
Browsers: Netscape (Linux + Windows), Mozilla (Linux), MS IE browsers (Windows)
Is there a C API of any sort (network access, say?) yet?
No, but since Cyc is translated into C we may not be far from having the library/header files.
A text-based telnet link is available from any language which supports sockets programming, but
parsing answers would be the job of the calling program.
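As a rough illustration of what "parsing answers" could involve, here is a minimal Python sketch that turns a parenthesized, Lisp-style answer string into nested lists. It assumes the answer arrives as a single S-expression of symbols, variables and double-quoted strings; the function name parse_sexpr is hypothetical, the exact wire format of the Cyc server is not specified here, and the socket handling itself is left out.

```python
import re

# Tokens: "(", ")", a double-quoted string, or a run of other non-space characters.
TOKEN = re.compile(r'\(|\)|"(?:[^"\\]|\\.)*"|[^\s()"]+')

def parse_sexpr(text):
    """Parse one S-expression into nested Python lists of token strings."""
    tokens = TOKEN.findall(text)
    pos = 0

    def read():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            items = []
            while pos < len(tokens) and tokens[pos] != ")":
                items.append(read())
            if pos >= len(tokens):
                raise ValueError("missing closing parenthesis")
            pos += 1  # consume the ")"
            return items
        if tok == ")":
            raise ValueError("unexpected closing parenthesis")
        return tok  # atoms (and quoted strings) are kept as raw text

    result = read()
    if pos != len(tokens):
        raise ValueError("trailing tokens after expression")
    return result

# Example input is made up for illustration, not captured from a real server.
print(parse_sexpr('(#$Animal #$Vertebrate "a string with (parens)")'))
```

Atoms are deliberately left as strings, so the calling program decides how to interpret constants such as #$Animal.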
I'm interested in natural language processing and text understanding. Can you recommend a
representative bibliography?
-
Allen, Natural Language Understanding
Where can I find more information on SubL and how to map files into OpenCyc's ontology?
Cycorp has a little bit of information at
http://www.cyc.com/cycdoc/ref/subl-reference.html
and http://www.cyc.com/doc/white_papers/mapping-ontologies-into-cyc_v31.pdf.
You can also look up the Common Lisp hyperspec at
http://www.xanalys.com/software_tools/reference/HyperSpec/Front/index.htm.
It gives the Common Lisp functions from which SubL is derived. If a function is not defined in the
SubL programming document, then very likely it has the signature given by Common Lisp.
When we release source code for the Cyc Server, it will likely be SubL that you either load into
the command line or API, or compile into C using a translator built into Cyc itself... then compiling
the translated C code with the GCC toolchain and static linking to a Cyc object library that we will
provide. The accompanying C header file will cover the Cyc API (as translated into C).
How will OpenCyc be useful to the XML community?
The web is more amorphous and unstructured than necessary. In recognition of this, the XML community
is providing a powerful syntax for adding structure to the web. Cyc enhances XML by providing a
powerful universal semantics for modeling objects described via XML.
Together, qualitatively new and powerful applications will emerge based on machines understanding
that which was previously unfathomable.
How is OpenCyc made available to the public? Where can one get it?
The website http://www.opencyc.org will be the primary source
of information about OpenCyc. Downloads of OpenCyc will be available from http://opencyc.sourceforge.net.
I just finished downloading the OpenCyc KB. Now what do I do?
The best way to explore the KB is via an HTML browser. When you run OpenCyc, it starts up an HTML
server that you can connect to locally on port 3602 (connect to http://localhost:3602/cgi-bin/cyccgi/cg?cb-start).
It should display the login window. Log in as CycAdministrator.
You should see "Successful Login Welcome CycAdministrator! Your project is currently not set."
Now just type any old thing (one of say 5,000 things) into the Complete box in the top frame,
and it'll show you what it knows about that thing. Like, "animal" or something.
Read the
ReadMe file for OpenCyc. Also, visit
the Welcome page and click the link in the
left frame titled, "Enter Your First Knowledge".
I've just selected OpenCycProject as the project. Was that correct?
Yes. When you select OpenCycProject as the project, new constants/assertions you create will be not
only timestamped and creatorstamped, but also projectstamped with OpenCycProject.
I just downloaded OpenCyc and am trying to use the tools. I see an ASK tool. What would be a good
question to ask OpenCyc?
First, note that the Ask tool is deprecated and replaced by the Query tool. A simple question to
ask OpenCyc is (#$genls #$Dog ?X) in the EverythingPSC. In English, that's "What types of things are
dogs?"
Go to Tools, Query. Then put in (#$genls #$Dog ?X) in the formula field, and EverythingPSC in the
Mt: box.
If the query works, you should see that dogs, amongst other things, are Solid Tangible Things,
Spatial Thing - Localized and EukaryoticOrganism. My favorite one is #$HexalateralObject.
If you click on the [Explain] next to HexalateralObject, then click on GENLS, it will tell you why
it knows that dogs have six sides. Since you don't know (much) CycL, you can go to Opt and turn on
the "Show Assertions in English" option, and it will attempt to paraphrase the CycL into English.
Vertebrates have 6 sides, so eventually it reasons out that a dog is a vertebrate. In English,
it would probably say something like "Every dog is a vertebrate, and every vertebrate has 6 sides".
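To see why a chain like this yields so many answers, it may help to picture #$genls as links that are followed transitively. The Python sketch below is a toy model only: the GENLS table is a tiny hand-picked fragment (the Animal and SolidTangibleThing links are assumptions made for illustration), and it says nothing about how the Cyc inference engine actually computes the answer.

```python
# Hypothetical fragment of a genls hierarchy; the real KB has many more links.
GENLS = {
    "Dog": ["Vertebrate"],
    "Vertebrate": ["HexalateralObject", "Animal"],
    "Animal": ["SolidTangibleThing"],
}

def all_genls(term, genls=GENLS):
    """Return every collection reachable from term by following genls links."""
    seen = []
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in genls.get(current, []):
            if parent not in seen:
                seen.append(parent)
                stack.append(parent)
    return seen

print(sorted(all_genls("Dog")))
# prints ['Animal', 'HexalateralObject', 'SolidTangibleThing', 'Vertebrate']
```

Dog never mentions HexalateralObject directly; it is reached only through Vertebrate, which is the kind of chain the [Explain] link displays.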
How can I test the inference engine? Can you recommend a successful query?
Since there are so few rules in OpenCyc 0.6, all queries will be pretty simple until you add your
own rules.
Try this. Add a rule that says that every Cyclist is located in Germany. Go to Tools, Assert,
and enter:
(#$implies
(#$isa ?PER #$Cyclist)
(#$objectFoundInLocation ?PER #$Germany))
Now you have a rule. Asserting rules is just like asserting facts (GAFs). Now, using the
Query tool, ask where the CycAdministrator is.
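The effect of such a rule can be sketched outside Cyc with a few lines of Python. This is a toy illustration under stated assumptions: facts are plain tuples, the fact that CycAdministrator is a Cyclist is assumed here because the example above depends on it, and the rule is applied eagerly, which is not necessarily how the Cyc inference engine answers the query.

```python
# Assumed fact for illustration: the example above relies on it being in the KB.
facts = {("isa", "CycAdministrator", "Cyclist")}

def apply_rule(facts):
    """Toy version of (implies (isa ?PER Cyclist) (objectFoundInLocation ?PER Germany))."""
    derived = set()
    for pred, arg1, arg2 in facts:
        if pred == "isa" and arg2 == "Cyclist":
            derived.add(("objectFoundInLocation", arg1, "Germany"))
    return derived

def where_is(thing, facts):
    """Return the locations known for thing, including those the rule derives."""
    known = facts | apply_rule(facts)
    return sorted(loc for pred, obj, loc in known
                  if pred == "objectFoundInLocation" and obj == thing)

print(where_is("CycAdministrator", facts))  # prints ['Germany']
```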
What will the OpenCyc documentation consist of?
A library of HTML files and presentations covering: how to use the KB Browser; how to create
an ontology; the dos and don'ts of knowledge representation; the syntax and use of CycL; the contents of OpenCyc; and how to program using the Cyc API.
Several suggested orderings of the material will be provided reflecting the variety of experience
levels and learning objectives of OpenCyc users. For example, an experienced logician who is interested in learning about Cyc's truth maintenance mechanisms will follow a different path through the learning materials than a novice programmer who wants to understand how to add simple facts to the KB.
I have a big database (500meg) that I want to put into OpenCyc. Is OpenCyc big enough
to handle that?
Yes. But that's not the best plan. You should have OpenCyc access the database and relay,
not put the database directly into OpenCyc. If you're entering knowledge into OpenCyc, don't worry too much about memory usage; there are known approaches that may help with reducing this. But if you're entering data, stop. Keep the data where it is.
If you want to represent the contents of a billion-row database, then you will need to
take the HL module dynamic access and caching approach. But if you're just representing
knowledge, not data, then put it in OpenCyc and don't worry about memory.
Does OpenCyc not save its updates to disk? There's a thing about saving an image in the
README but that looks like you write a totally new image.
For now, writing a new image is how it is done. See Installation Instructions at
http://www.opencyc.org/doc/install/. "You can save the state of the OpenCyc world by
entering the form (write-image "path/name") before shutting down using (exit).
Edit run-cyc.sh, replacing world/latest.load with the path to your saved world.
(See the next question as well.)"
How can I save what I've added to OpenCyc? I've created my name constant a number of times
but each time I logout/exit the next time it doesn't remember and wants me to create it all over again.
After you add constants and assertions of your own, you can save out a new world.
Either in the SubL interactor (available from the Nav screen) or in the xterm
window where you started Cyc:
-
Type (write-image "world/mynewworld") [where 'mynewworld' is really any filename
you choose].
-
Quit out of Cyc. You can do this by typing '(exit)' [without the quotes but
with the parentheses] in the window where you started Cyc.
The next time you start Cyc, don't use the './runcyc' script. Instead,
-
cd to the run directory
-
cd to the world directory
-
check that your 'mynewworld' file is in the world directory
-
cd .. [back to the run directory]
-
type: bin/latest.bin -w world/mynewworld
[again, 'mynewworld' is whatever filename you used]
Now the constants and assertions you added last time should be viewable from
the KB browser.
You can keep saving worlds out under new names if you want to keep a revision
history, and you can delete old worlds at any time.
I am trying to create a new constant, but I keep getting the following error:
HTML Transfer halted due to script error: Error on ioctl() call. What's going on?
The error is occurring when OpenCyc creates a new guid for the term. Of the several standard
methods for creating a guid, on Linux we use the one which involves the ethernet mac address.
So you currently must have an ethernet card installed on Linux to create constants with OpenCyc.
We plan to fix this bug to generate a random guid if an ethernet card cannot be found on the host
computer. However, at this point, if you do not have an ethernet card for internet connection,
you will be unable to create new constants. We think this is fixed in version 0.7.
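For readers unfamiliar with the two styles of identifier mentioned here, Python's standard uuid module offers an analogy. This is not Cyc's guid code; it only shows the difference between an identifier built from the host's hardware address and the time (version 1) and a purely random one (version 4). The helper name new_guid is hypothetical.

```python
import uuid

def new_guid(prefer_mac=True):
    """Return a version 1 UUID (host address plus time) or a random version 4 UUID."""
    if prefer_mac:
        return uuid.uuid1()
    return uuid.uuid4()

mac_based = new_guid()
random_based = new_guid(prefer_mac=False)
print(mac_based.version, random_based.version)  # prints 1 4
```

Python's uuid1() itself substitutes a random node value when no hardware address can be found, which is similar in spirit to the fallback planned above.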
What can the OpenCyc Planner be used for?
OpenCyc Planner + programming language = generated code.
Cycorp has the ambition of showing how OpenCyc can apply commonsense planning with programming language primitives to create programs meeting a user's high level requirements (goal state). Much of the Planner is scheduled to be ready by release 1.0. For the 0.7 release, we should have a small library of functions that can get planned out into several different languages. We may even get far enough to include programming statements.
The released vocabulary will contain methodForAction, whose consequent is the plan at that step (depth). The 0.7 release is not populated yet with any example axioms, but we're working on that for the next release.
Most planners are propositional. For example, you can ask "Does the monkey hold the stick?", but OpenCyc can be asked to "Prove that the monkey holds the stick or prove that he doesn't hold the stick." We use a derivative of the open source SHOP planner from UMBC. Of course OpenCyc is not complete (by complete, I mean that backchaining rules will not terminate when there are many rules or many values to instantiate, so when you run out of time, there is no answer) and is slower than a purely propositional planner. But for real world apps, you tune the queries with predicates such as highlyRelevantAssertion.
highlyRelevantAssertion could be a rule consequent as well. In this way, you can use OpenCyc to make OpenCyc smarter. We also have proof checker mode on asks, where the relevant rules can be specified when you know the general form of repeated queries. There are many ways to make problem inferences faster.
What is a "spindle" in Cycorp terminology?
A spindle is where you have many microtheories that all genl up to one microtheory. Below them all you have a query microtheory that can 'look up' and 'see' the entire spindle. This query microtheory is the bottom of the spindle and is empty, but it can 'see' everything above it, so it is a good place to try to do a proof. Microtheories with names that end in "-PSC" are usually at the base of a spindle (PSC stands for Problem Solving Context). So the knowledge of which queries to ask resides in either a different microtheory or in the application which uses OpenCyc.
What is an ImplementationConstant? I've noticed that some of the constants in the OpenCyc KB are ImplementationConstants -- what does this mean?
ImplementationConstants are any constants that are not used to represent either common sense or other shared knowledge. Other shared knowledge would be like knowledge of some expert domain, like Nuclear Physics. So, ImplementationConstants are there to implement some application or capability. One spec is NLImplementationConstant.
How is #$forAll used?
#$forAll is Cyc's implementation of the universal quantifier in predicate logic.
#$thereExists is Cyc's implementation of the existential quantifier.
Any occurrence of a variable in an assertion in the Cyc KB that is not
explicitly bound by a quantifier is treated as if it were bound by
an implicit initial '#$forAll'.
Thus, a CycL assertion such as the following:
(implies
(isa ?CAT DomesticCat)
(eatsWillingly ?CAT Meat))
is interpreted as if it were written thus:
(forAll ?CAT
(implies
(isa ?CAT DomesticCat)
(eatsWillingly ?CAT Meat)))
(Of course, if there is more than one unbound variable in an assertion,
each is treated as if it were bound by its own #$forAll quantifier.)
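The implicit-quantifier reading can be mimicked with a short Python sketch that treats a CycL formula as nested lists. This is an illustration only, not Cyc's canonicalizer: the helper names free_variables and close_universally are hypothetical, and quantifiers are assumed to have the three-element form (quantifier variable body).

```python
QUANTIFIERS = {"forAll", "thereExists"}

def free_variables(expr, bound=frozenset()):
    """List variables (strings starting with '?') not bound by a quantifier, in order."""
    if isinstance(expr, str):
        return [expr] if expr.startswith("?") and expr not in bound else []
    if expr and isinstance(expr[0], str) and expr[0] in QUANTIFIERS:
        _, var, body = expr
        return free_variables(body, bound | {var})
    found = []
    for part in expr:
        for var in free_variables(part, bound):
            if var not in found:
                found.append(var)
    return found

def close_universally(expr):
    """Wrap expr in one forAll per free variable, outermost variable first."""
    for var in reversed(free_variables(expr)):
        expr = ["forAll", var, expr]
    return expr

rule = ["implies",
        ["isa", "?CAT", "DomesticCat"],
        ["eatsWillingly", "?CAT", "Meat"]]
print(close_universally(rule))
```

Running this on the cat rule wraps it in a single forAll over ?CAT, matching the explicit form shown above.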
Such implicit universal quantification makes working in CycL vastly more
convenient, but it does mean that if one accidentally omits an
existential quantifier from an assertion this may not be readily apparent,
as the variable that would have been bound by the omitted existential
quantifier is assumed to be universally quantified (rather than the
assertion being rejected as ill-formed). One's assertion will then not
have the meaning one expects. Consider, for example, the assertion that
every cat has a tail:
(implies
(isa ?CAT DomesticCat)
(thereExists ?TAIL
(and
(isa ?TAIL Tail-BodyPart)
(anatomicalParts ?CAT ?TAIL))))
If I slip up and write:
(implies
(isa ?CAT DomesticCat)
(and
(isa ?TAIL Tail-BodyPart)
(anatomicalParts ?CAT ?TAIL)))
then the meaning of this assertion will be rather different (Exercise for
the reader: how?)
If universal quantification is implicit in the way described above, why
does CycL even contain the term #$forAll? This is because in certain
special cases it is necessary that a #$forAll operator be used
explicitly. Consider the following assertion:
(thereExists ?BELOVED
(loves ?LOVER ?BELOVED))
This states (roughly) that everyone loves someone or other. But
although everyone has a beloved in this scenario, the person loved may
differ from lover to lover. What if one wants to assert the stronger
statement that there is *some particular person* who is loved by everyone?
Then one needs an assertion such as the following:
(thereExists ?BELOVED
(forAll ?LOVER
(loves ?LOVER ?BELOVED)))
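The difference between the two quantifier orders can be checked by brute force on a small finite domain. The Python sketch below is only an illustration; the people and the "loves" pairs are invented for the example.

```python
def everyone_loves_someone(people, loves):
    """forAll ?LOVER, thereExists ?BELOVED such that (loves ?LOVER ?BELOVED)."""
    return all(any((lover, beloved) in loves for beloved in people)
               for lover in people)

def someone_loved_by_everyone(people, loves):
    """thereExists ?BELOVED such that forAll ?LOVER, (loves ?LOVER ?BELOVED)."""
    return any(all((lover, beloved) in loves for lover in people)
               for beloved in people)

people = ["Ann", "Bob", "Cal"]
# Each person loves a different person: the weaker statement holds, the stronger fails.
circle = {("Ann", "Bob"), ("Bob", "Cal"), ("Cal", "Ann")}
# Everyone (including Cal) loves Cal: both statements hold.
fan_club = {("Ann", "Cal"), ("Bob", "Cal"), ("Cal", "Cal")}

print(everyone_loves_someone(people, circle), someone_loved_by_everyone(people, circle))
# prints True False
print(everyone_loves_someone(people, fan_club), someone_loved_by_everyone(people, fan_club))
# prints True True
```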
What does a yellow M mean in the web interface?
A yellow M next to an assertion in the KB Browser means that the assertion has meta-assertions stated about it. Click on the icon link (colored ball) to display the assertion and see a list of the meta-assertions. Look at the Key for Browser Icons for a list of icons used in the KB Browser and what they represent.
How many assertions must a constant have on it in order to be valid?
For an individual, you just need one: #$isa something
For a collection, you need one: #$genls something
For a predicate, you need one: #$genlPred something
Of course #$comment would be good on all of the above as well. For ordinary collections, think of genus and differentia. What does a term inherit and what makes it different from its siblings? That is the convention for dictionaries that we mostly adopt. So each collection has several genls: one main one (salient) and others which set it apart from siblings.
Is it okay if a new term's arg constraints are too general when you first create it? And you make them more specific later on?
Yes. The arg constraints of all terms are basically set at what's remotely possible, not what's most likely.
What is the proper format for attaching a comment to a constant? (#$Comment this is a comment)?
No. The correct format for attaching a comment to a constant is:
(#$comment #$NorthernHemisphereMt "This is the comment for NorthernHemisphereMt")
Can OpenCyc reason about beliefs and other propositional attitudes?
The full Cyc knowledge base contains various general rules
about #$beliefs, #$knows, and similar predicates (see
#$PropositionalAttitudeSlot), some of which will certainly
be included in future versions of OpenCyc. By and large,
these rules are not meant to explicate any particular
psychological or epistemic "theory", but (like the Cyc
ontology generally) are meant to represent commonsense
understanding of the notions in question. For example,
the rule
(#$implies
(#$knows ?AGENT ?PROP)
(#$trueSentence ?PROP))
means (roughly) that if somebody knows that P, then P must be true.
Is there a concept in OpenCyc of having a Subjective (how we see/use things) vs Objective (what things are) ontology?
Our intention at Cycorp is to represent the objective world and allow the representation of agents' beliefs. So we try to do both. Cyc knows that Paris is in France (objective) but one can also represent the notion that someone thinks Paris is under water (subjective). We don't have abstract modeling worked out yet, but the framework is mostly there.
Can a microtheory define (have) an assertion that conflicts with an assertion in another microtheory?
Yes! If I entered false information about a subject and then I entered true
information about the same subject, won't the two assertions clash? As long as the two assertions are not in the same microtheory, this is no problem for OpenCyc. That's the whole point of microtheories. In fact, you can have microtheories in which every assertion is believed to be false or mythological. So then how does OpenCyc decide which microtheory to believe? It depends on which microtheory you query from in the genlMt hierarchy.
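A toy Python model may make the visibility idea concrete. It assumes only what is described above, namely that a query made in a microtheory sees that microtheory's assertions plus those of the microtheories it inherits from via genlMt. The microtheory names and the assertion strings are invented for the example and are not taken from the OpenCyc KB.

```python
# Hypothetical genlMt links: each microtheory lists the ones it inherits from.
GENL_MT = {
    "GreekMythologyMt": ["BaseKB"],
    "ModernScienceMt": ["BaseKB"],
    "BaseKB": [],
}
# Hypothetical assertions, written as plain strings for illustration.
ASSERTIONS = {
    "BaseKB": {"Zeus is an individual"},
    "GreekMythologyMt": {"Zeus causes lightning"},
    "ModernScienceMt": {"Zeus does not cause lightning"},
}

def visible_mts(mt):
    """Return mt plus every microtheory reachable by following genlMt links."""
    seen = [mt]
    for current in seen:  # the list grows while we walk it
        for parent in GENL_MT.get(current, []):
            if parent not in seen:
                seen.append(parent)
    return seen

def visible_assertions(mt):
    """Collect the assertions of every microtheory visible from mt."""
    result = set()
    for visible in visible_mts(mt):
        result |= ASSERTIONS.get(visible, set())
    return result

print(sorted(visible_assertions("GreekMythologyMt")))
# prints ['Zeus causes lightning', 'Zeus is an individual']
```

The two conflicting statements never appear together, because neither of the two specialized microtheories inherits from the other.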
What makes the OpenCyc release so historic and important?
From this point forward, real-world common sense can be expected to play an integral part of software applications. For the first time, the world's only large-scale, task-independent, language-independent, extensible, reusable, common-sense knowledge base is being made available to the world. Beginning now, software can become increasingly and arbitrarily smarter.
Is it a good thing to keep down the datasize for an Mt?
Keeping the datasize for an Mt down makes it easier for a person to review the material
in the Mt. Reasoning over sets of similar object types (rivers, cities, etc.) is sometimes
slowed down the larger the set that is examined. So if you ask a question in a
microtheory that inherits from SwedenNaturalGeographyMt but not from FinlandNaturalGeographyMt
(or the others) about rivers and the bodies of water into which they flow, the inference
engine would not have to consider the 100,000 lakes in Finland (or however many you enter)
and the rivers in the other Scandinavian countries (or the US).
With natural language input, this becomes more important. For example, if you include
data about all the named farmsteads in Norway and Sweden in the knowledge base and refer
to a farm in Sweden by name, the system could reject parses involving Norwegian farms
of the same name if you are asking the question in a microtheory that inherits from
SwedenGeographyMt and not from NorwayGeographyMt.
Are there specific rules which must be defined when creating a new Microtheory?
To create a new microtheory, all you have to do is assert (isa MyMicrotheory GeneralMicrotheory).
And make sure you place it in the hierarchy with (genlMt MyMicrotheory BaseKB). "BaseKB" can be replaced with whatever microtheory you want the new Mt to inherit from.
When I create a new microtheory, do I have to specify its genlMt? Or is it defined by default?
#$genlMt is not defined by default when you create a new microtheory. You must (or your application must) assert it -- or else you could use a forward rule to do it.
Which microtheories should be used for geographical information?
Here is a hierarchy of existing world geography data microtheories with
descriptions of what goes in each Mt.
WorldNaturalGeographyVocabularyMt
| isas & genls about #$GeographicalRegions on the Earth as if
| humankind and none of its products existed
WorldNaturalGeographyMt
| Other assertions about #$GeographicalRegions that do not relate
| to #$GeopoliticalEntities or human constructs.
| WorldPoliticalGeographyDataVocabularyMt
| | #$isas & #$genls about cities and countries (#$GeopoliticalEntities)
| | not involving #$GeographicalRegions (parts of the Earth)
| WorldPoliticalGeographyDataMt
| | other assertions about #$GeopoliticalEntities as organizations
| | or #$Agents and not involving #$GeographicalRegions
WorldGeographyMt
| assertions relating #$GeopoliticalEntities to #$GeographicalRegions
| but not treating them as #$GeographicalRegions.
WorldGeographyDualistMt
assertions treating #$GeopoliticalEntities as #$GeographicalRegions
If different Cyc terms were used for, say, the territory (land) of
Sweden and the country (organization) of Sweden, then such
a microtheory would not be needed. This could be done
functionally if a reifiable unary function were created
to derive the territory from the #$GeopoliticalEntity
Is all knowledge about X (e.g. countries) stored in Y microtheory (e.g. #$WorldPoliticalGeographyDataVocabularyMt)?
No. Information about concepts in Cyc can be stored in many microtheories. For
objects in the world (instances of #$SomethingExisting) there is generally a most
general data microtheory in which they are defined. Assertions about such objects
(such as countries) belong in that #$definingMt or a more specialized microtheory.
For the specific case of modern countries, assertions using #$isa and #$genls to
non-spatial concepts would belong in #$WorldPoliticalGeographyDataVocabularyMt;
other assertions about the countries as #$Agents would go in
#$WorldPoliticalGeographyDataMt; assertions that also involved instances of
#$GeographicalRegion would go in #$WorldGeographyMt, and those that treated the
country as if it were a land mass (e.g. giving its area or stating that it
borders on another country) would go in #$WorldGeographyDualistMt.
If the context is far different from Earth at the turn of the third millennium,
new microtheories should be created. If you were reasoning about 17th-century
Europe, third-century East Asia, or late third Era Middle Earth, you would not
want to use any of the above political microtheories or inherit any information
from them. In the Middle Earth case, you would also not want to inherit from the
Natural Geography #$DataMicrotheories.
What is the most general microtheory that should be used for data
about governments?
Use #$WorldPoliticalGeographyDataMt, which stores knowledge that relates organizations and
other agents.
What microtheory should be used for knowledge about the placement
of a #$GeopoliticalEntity, such as a country or city, on the globe?
Assertions that treat a country as a piece of land belong in #$WorldGeographyDualistMt,
or a more specialized microtheory such as #$UnitedStatesGeographyDualistMt.
Does knowledge about #$geopoliticalSubdivision (e.g. a
city being part of a country) belong in #$WorldGeographyDualistMt?
No. Knowledge about #$geopoliticalSubdivision belongs in #$WorldPoliticalGeographyMt
because it relates two organizations. Cyc can infer that the territory of the
suborganization (City or State) is part of the territory of the larger organization
(State or Country).
All knowledge about the placement of regions in a country should
still be stored in WorldGeographyDualistMt, right?
Yes, so long as the "regions" are not instances of #$GeopoliticalEntity which are
#$geopoliticalSubdivisions of the country.
Is all knowledge about the placement of a subregion in a larger region
(neither of which is an organization) stored in WorldNaturalGeographyMt?
Yes. So long as no #$GeopoliticalEntity is involved, all knowledge about the
placement of a subregion in a region (regions with no political significance) is stored
in #$WorldNaturalGeographyMt or in a more geographically limited microtheory such as
#$UnitedStatesNaturalGeographyMt or AntarcticaNaturalGeographyMt (which does not
currently exist).
My project involves geographic regions outside of the United States (for
example, in Norway, Sweden, and Finland). Should I create one Mt for each country for
which I plan to add a lot of geography data?
It would be appropriate to add a set of microtheories, either one set for the
Nordic countries as a group or one set for each country. Cycorp has considered
creating them for each country, but has not yet taken that step. If you are going
to cover natural features and geopolitical entities and the relationships among them,
that would mean creating 6 (or more) Mts for each country. For example:
-
SwedenPoliticalGeographyDataVocabularyMt
-
SwedenPoliticalGeographyMt
-
SwedenNaturalGeographyVocabularyMt
-
SwedenNaturalGeographyMt
-
SwedenGeographyMt
-
SwedenGeographyDualistMt
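The naming pattern in the list above can be generated mechanically. The following is a minimal sketch that only builds the six names as strings. The function name is hypothetical, and nothing here creates the microtheories in the KB or sets up their genlMt links.

```python
# Suffixes copied from the Sweden example above.
MT_SUFFIXES = [
    "PoliticalGeographyDataVocabularyMt",
    "PoliticalGeographyMt",
    "NaturalGeographyVocabularyMt",
    "NaturalGeographyMt",
    "GeographyMt",
    "GeographyDualistMt",
]


def geography_mt_names(prefix: str) -> list[str]:
    """Build the six microtheory names for a country or region prefix."""
    return [prefix + suffix for suffix in MT_SUFFIXES]


if __name__ == "__main__":
    for name in geography_mt_names("Norway"):
        print(name)
```

The same helper works for a group prefix such as "Nordic" if you decide on one set of microtheories for several countries.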
If you do not see a need for separate microtheories for each country, you might not need them.
You could make, for example, EuropeanUnion...GeographyMt and NATO...GeographyMt, which would
have genlMts only to the members of those organizations.
It's actually probable that types of regions differ from country to country.
In Sweden, for example, we have loosely speaking two types of Cities and two types of
Counties. They each constitute a different way to divide the country into parts. How
should I represent this with microtheories?
If there are different ways to divide Sweden into parts, it would be useful to create
different specializations of City and County in #$SwedenPoliticalGeographyDataVocabularyMt
(or #$NordicPoliticalGeographyDataVocabularyMt). Separate microtheories would not be
needed to handle additional types of #$GeopoliticalEntity.
Is there some predicate like (somePredicate Collection1 CollectionSub1 CollectionSub2 CollectionSub3)?
We don't use many predicates with an arity of 4 (the great majority are binary), but you can ask OpenCyc. From the KB Browser, go to Tools, Ask, then enter this in the main field:
(and
(isa ?p Predicate)
(arity ?p 4)
(arg1Isa ?p Collection)
(arg2Isa ?p Collection)
(arg3Isa ?p Collection)
(arg4Isa ?p Collection))
Make sure you've typed BaseKB in the Mt: field.
OpenCyc doesn't have a specialized predicate like this, but now you know how to ask!
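If you want to repeat this search for other arities, the query text can be generated rather than typed by hand. The sketch below only builds the query string to paste into the Ask page. The function name is hypothetical, and it assumes that an argNIsa predicate exists in the KB for each argument position you ask about, which you should check for larger arities.

```python
def collection_predicate_query(arity: int) -> str:
    """Build a CycL query for predicates of the given arity whose
    arguments are all constrained to be Collections."""
    if arity < 1:
        raise ValueError("arity must be at least 1")
    clauses = ["(isa ?p Predicate)", f"(arity ?p {arity})"]
    # One argNIsa constraint per argument position (assumed to exist in the KB).
    clauses += [f"(arg{i}Isa ?p Collection)" for i in range(1, arity + 1)]
    return "(and\n  " + "\n  ".join(clauses) + ")"


if __name__ == "__main__":
    print(collection_predicate_query(4))
```

With an arity of 4 this produces the same clauses as the query shown above.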
I am looking at the OpenCyc SubL API doc, and I do not see the things I am looking for, like ASK, ASSERT, CREATE, etc. I would like to be able to write scripts to feed to the CYC(n): prompt. Is there any documentation available on this?
Yes. Take a look at Section 2.7 Inference Module. The Java classes org.opencyc.api.CycAccess and org.opencyc.api.UnitTest have many examples of using commands like cyc-create, cyc-kill, cyc-assert, etc. Note that the 'fi-' commands are deprecated.
How can I get to the command prompt and where can I read about it?
The OpenCyc SubL API is documented at http://www.OpenCyc.org/doc/cycapi -- that's how OpenCyc interacts with other software systems and with its agents. The command line is the same interface we expose at TCP port 3601, which you can telnet to. Port 3614 is the efficient binary port for which Java client support classes are provided.
When something is asked of OpenCyc, is the asked assertion reified as fact somewhere in OpenCyc?
No, at this point asks are not entered into the KB from either the API or the Query tool in the KB Browser. The RKF tools function differently, but they are not included in the beta releases.
Where does the name "Cyc" come from?
"Cyc" comes from the word "encyclopedia". The Cyc Project set out to represent in computer-usable form the knowledge needed to understand an encyclopedia. Not the knowledge in an encyclopedia, but the things the author of an encyclopedia article feels are too broadly understood to bother writing down. This knowledge that can be taken for granted is what we call "common sense".
What is an HL Module?
It's a module that hooks into OpenCyc's inference engine to dynamically solve certain kinds of problems. HL modules can also be used for accessing certain kinds of data; for instance, one could hook OpenCyc into a large external data store of 20GB or so. Right now, they can only be written in SubL, but it would be a fairly easy extension to allow them to be linked in or written in another language such as Java. There are plans for doing that.
What is SKSI?
SKSI is Semantic Knowledge Source Integration. The idea is that you map any knowledge source (e.g. a web page, a database) into Cyc via a detailed semantic schema mapping. Then you can ask Cyc queries and it can answer them based on information in any of the knowledge sources it has access to.
If you can write HL modules, you can roll your own SKSI functionality, so you're not dependent on waiting for Cycorp to release anything except the HL module hooks. But at Cycorp we're trying to make it easier and more automatic to hook in new knowledge sources. As of version 0.6b, you can write your own "slurper", which is a tool to slurp knowledge from some data source into Cyc, and store it as assertions in the KB.
How do you write a slurper?
Writing a slurper is only slightly more complex than conversion between data formats. It's like converting between, say, SQL and CycL. Let me give you an example. Say you have a database with just two fields, Paper and References, which are both strings. Then for each row in the database, you use the Cyc API to make an assertion of the form (paperReferences <arg1> <arg2>) where arg1 and arg2 are the two strings.
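The conversion step of the two-field example can be sketched in a few lines. This is only an illustration under stated assumptions: it turns rows into CycL assertion strings and does not call the Cyc API. The function names are hypothetical, and the backslash escaping of quotes inside strings is an assumed Lisp-style convention that you should verify against the CycL specification.

```python
from typing import Iterable, List, Tuple


def cycl_string(value: str) -> str:
    """Quote a Python string as a string literal.
    Assumption: backslashes and double quotes are escaped with a backslash."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def rows_to_assertions(rows: Iterable[Tuple[str, str]]) -> List[str]:
    """Turn (Paper, References) rows into paperReferences assertion strings."""
    return [
        f"(paperReferences {cycl_string(paper)} {cycl_string(reference)})"
        for paper, reference in rows
    ]


if __name__ == "__main__":
    # Stand-in for rows fetched from the database; the values are made up.
    sample_rows = [("Paper A", "Paper B"), ("Paper A", "Paper C")]
    for sentence in rows_to_assertions(sample_rows):
        print(sentence)
```

Each resulting sentence would then be handed to whatever assert call your API client provides, in a microtheory of your choosing. Note that paperReferences is the example predicate from the answer above, so it would have to be created in the KB before any such assertion could succeed.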
Can I import genetic information into Cyc?
Sure, you can import ANY kind of information into Cyc. You just need the underlying vocabulary to represent it, and if it isn't there, you can create it yourself. In the full KB we have lots of knowledge about biology and genetics, but this has not yet been released in OpenCyc.
What differentiates Cyc from, say, a neural net or expert system?
Being able to trace its line of thought. Cyc has a justification for everything it believes, and it always bottoms out into either "X told me so" or "some piece of code says so". A cool consequence of that is that if the reason for Cyc believing something goes away, the consequences of that belief will also go away.
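The idea of beliefs that disappear along with their reasons can be shown with a toy example. This is not how Cyc is implemented. It is a minimal sketch of justification tracking in general, and the class and method names are made up for illustration.

```python
class ToyBeliefStore:
    """Toy store where every fact is believed only while it has a live justification."""

    def __init__(self):
        # fact -> list of justifications; each justification is a frozenset of
        # supporting facts. An empty frozenset means "someone told me so".
        self.justifications = {}

    def tell(self, fact):
        """Record a fact as directly told."""
        self.justifications.setdefault(fact, []).append(frozenset())

    def derive(self, fact, because):
        """Record a fact as following from the given supporting facts."""
        self.justifications.setdefault(fact, []).append(frozenset(because))

    def untell(self, fact):
        """Drop the 'told me so' justifications; derived ones are kept."""
        self.justifications[fact] = [
            j for j in self.justifications.get(fact, []) if j
        ]

    def believes(self, fact, _seen=frozenset()):
        """A fact is believed if some justification has all its supports believed."""
        if fact in _seen:  # guard against circular support
            return False
        seen = _seen | {fact}
        return any(
            all(self.believes(support, seen) for support in justification)
            for justification in self.justifications.get(fact, [])
        )


if __name__ == "__main__":
    store = ToyBeliefStore()
    store.tell("Tweety is a bird")
    store.derive("Tweety has feathers", because=["Tweety is a bird"])
    store.untell("Tweety is a bird")
    print(store.believes("Tweety has feathers"))
```

After the told fact is withdrawn, the derived fact has no remaining support, so the store no longer believes it.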
Is there no way to log in to the browser, unless you work at Cyc?
Anyone can log in to OpenCyc. If you want to see the pages that are not available to the Guest login, log in as CycAdministrator. CycAdministrator can see all pages. Guest and CycAdministrator are the only two logins included in an OpenCyc build.
What happens if Cyc decides that human beings are responsible for world chaos? It may decide to eliminate
us. Is Cyc controlled by Asimov's Laws of Robotics? Shouldn't it be?
Because Cyc currently cannot initiate any actions, we are safeguarded for now from the potential catastrophe you describe.
As Cyc grows in capability to the point where it can "do things" and perform self-improvement, we will have already incorporated enough commonsense knowledge