COGBASE API Version 1

Daniel J. Olsher (dan //at// intmind.com)

Coming soon: Link to core COGBASE paper (in press)

Usage Instructions

All COGBASE APIs can be reached via HTTP calls:

http(s)://api.cogview.com/v1/< API Function >/

A single QUERY input parameter, typically a single concept or a comma-separated list of concepts, may be passed at the end of the URL, as in the following example:

http://api.cogview.com/v1/databaseRelativeContentAmount/animal

All API functions return JSON data of the form indicated in the corresponding function description.

HTTP Return Codes:

200: Normal return; request and result are valid.
404: API name is incorrect, or spurious path elements have been added.
405: The requested method (GET or POST) is not allowed for this particular API call.
500: System error. Try changing parameters.
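
The URL scheme and return codes above can be sketched in Python as follows. This is a minimal illustration, not an official client: the helper names build_url and explain_status are hypothetical, and the code only constructs URLs and interprets codes; it does not contact the server.

```python
from urllib.parse import quote

BASE_URL = "http://api.cogview.com/v1"  # base taken from the usage instructions

# Meanings of the documented HTTP return codes.
STATUS_MEANINGS = {
    200: "Normal return; request and result are valid.",
    404: "API name is incorrect, or spurious path elements have been added.",
    405: "The requested method (GET or POST) is not allowed for this API call.",
    500: "System error. Try changing parameters.",
}


def build_url(function_name, query=None):
    """Return the URL for an API function, with an optional single QUERY
    value (one concept or a list of concepts) as the final path element."""
    url = f"{BASE_URL}/{function_name}/"
    if query is not None:
        if not isinstance(query, str):
            query = ",".join(query)  # comma-separated list of concepts
        url += quote(query, safe=",")  # percent-encode spaces, keep commas
    return url


def explain_status(code):
    """Map a documented return code to its description."""
    return STATUS_MEANINGS.get(code, f"Undocumented status code: {code}")
```

For instance, build_url("databaseRelativeContentAmount", "animal") reproduces the example URL shown above.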

API

1. Semantic NLP // Document Processing

Semantic Document Gisting

/v1/documentSemanticConceptualGist

Given a document, selects those lexical items most semantically representative of the entire document.

Semantics are identified for each lexical item, and the most broadly representative are identified. Lexical items containing those semantics are then returned.

Example

Input:

I am a relativist who would like to answer your question, but the way you phrase the question makes it unanswerable. The concepts of "right" and "wrong" (or "correct/incorrect" or "true/false") belong to the domain of epistemological rather than moral questions. It makes no sense to ask if a moral position is right or wrong, although it is legitimate to ask if it is good (or better than another position).

Let me illustrate this point by looking at the psychological derivatives of epistemology and ethics: perception and motivation, respectively. One can certainly ask if a percept is "right" (correct, true, veridical) or "wrong" (incorrect, false, illusory). But it makes little sense to ask if a motive is true or false. On the other hand, it is strange to ask whether a percept is morally good or evil, but one can certainly ask that question about motives.

Therefore, your suggested answers (a)-(c) simply can't be considered: they assume you can judge the correctness of a moral judgment.

Now the problem with (d) is that it is double-barrelled: I agree with the first part (that the "rightness" of a moral position is a meaningless question), for the reasons stated above. But that is irrelevant to the alleged implication (not an implication at all) that one cannot feel peace is better than war. I certainly can make value judgments (bad, better, best) without asserting the "correctness" of the position.

Sorry for the lengthy dismissal of (a)-(d). My short (e) answer is that when two individuals grotesquely disagree on a moral issue, neither is right (correct) or wrong (incorrect). They simply hold different moral values (feelings).

...(signature)...

Output:

{article, say, id, like, know, two, person, disagree, right, one, wrong, sometimes, rarely, pretty, good, idea, one, wrong, never, information, make, best, really, must, make, decision, idea, right, judgement, peace, better, war, question, need, correct, answer, something, else, short, nice, hope, tell, assumption, value, real, statement, like, assumption, value, part, objective, reality, like, answer, question, way, phrase, question, make, concept, right, wrong, domain, question, make, sense, ask, position, right, wrong, ask, good, better, position, point, look, perception, motivation, one, ask, right, correct, true, wrong, incorrect, false, make, little, sense, ask, motive, true, false, hand, ask, good, evil, one, ask, question, answer, consider, judge, problem, agree, first, part, position, question, reason, state, one, feel, peace, better, war, make, value, bad, better, best, without, position, short, answer, two, individual, disagree, issue, neither, right, correct, wrong, incorrect, hold, value, john, department, state, state, behavior, sort, get, drink, pick, write, part, life}

Output compressed by count:

{moral: 6, ask: 6, question: 5, right: 4, make: 4, wrong: 4, position: 4, certainly: 3, better: 3, state: 3, one: 3, answer: 3, good: 2, implication: 2, degeneracy: 2, correct: 2, ...}

In

POST:
Document to be gisted (currently limited to 450 words)
GET: 
suppressFreqCounts (optional): Suppresses gistConceptFreqCounts (see below) in output

Out

semanticConceptualGist: List of most semantically representative concepts
gistMemberConcepts: Number of document lexical items that were included in gist
gistNonMemberConcepts: Number of excluded lexical items
gistConceptFreqCounts (optional): Dictionary containing concept frequency data (number of times each lexical item appears within gist) as <concept, count> pairs - useful for determining relative word importance

Speed on Demonstration Server

Slower Than Average: currently limited to 450 words per input text
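
A hedged Python sketch of preparing a gisting request and ranking the returned frequency counts. Several details are assumptions, since the page does not specify them: that the document is sent as the raw UTF-8 POST body, that suppressFreqCounts is switched on with the value "true", and that gistConceptFreqCounts arrives as a JSON object mapping concept to count. The function names are hypothetical, and the request is built but never sent.

```python
import json
from urllib.request import Request

GIST_URL = "http://api.cogview.com/v1/documentSemanticConceptualGist/"
MAX_WORDS = 450  # documented limit for this function


def build_gist_request(document, suppress_freq_counts=False):
    """Build (but do not send) a POST request for the gisting function."""
    word_count = len(document.split())
    if word_count > MAX_WORDS:
        raise ValueError(f"document has {word_count} words; limit is {MAX_WORDS}")
    url = GIST_URL
    if suppress_freq_counts:
        url += "?suppressFreqCounts=true"  # assumed flag value
    # Assumption: the document travels as the raw UTF-8 POST body.
    return Request(url, data=document.encode("utf-8"), method="POST")


def top_concepts(response_text, n=5):
    """Return the n most frequent gist concepts from a JSON response body."""
    payload = json.loads(response_text)
    counts = payload.get("gistConceptFreqCounts", {})
    return sorted(counts.items(), key=lambda kv: kv[1], reverse=True)[:n]
```

Checking the word limit on the client side avoids a round trip for documents the demonstration server would not accept.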

Topic Modeling

/v1/documentSemanticTopicModeling

Identifies topics (clusters of semantic features) within documents.

In

POST: 
Document (currently limited to 1000 words)

Out

topics: Identified topics in the form of a list of lists, with each inner list (cluster) containing clustered semantic features

Speed on Demonstration Server

Slower than Average: currently limited to 1000 words per input text
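
Since topics is returned as a list of lists, a short Python sketch can turn each cluster into a readable label. The function name summarize_topics is hypothetical, and labelling a cluster by its first few features is only one convenient choice; the page does not say whether features within a cluster are ordered.

```python
import json


def summarize_topics(response_text, max_features=3):
    """Label each non-empty cluster in `topics` by its first few features."""
    topics = json.loads(response_text)["topics"]
    return [", ".join(cluster[:max_features]) for cluster in topics if cluster]
```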

Paper Reference:

Rajagopal, D., Olsher, D., Cambria, E., and Kwok, K. (2013) Commonsense-based topic modeling. ACM KDD, Chicago.

2. Semantic NLP // Lexical Item Processing: Concept-Based Algorithms

As is true of COGBASE algorithms in general, the algorithms in this section are purely meaning-based and involve no ontological reasoning (COGBASE is not ontology-based).

Category Decomposition / Concept Characterization

/v1/characterizerTimeOptimized
/v1/characterizerCoverageOptimized

In COGVIEW and INTELNET, categories and concepts are understood as 'clouds' of meaning superimposed upon an underlying field of concepts and concept interactions (either COGBASE primitives or INTELNET energy-conveying edges).

Using commonsense knowledge, this algorithm discovers a set of concepts that, taken together, define the content field of a particular category or concept. As an example, the concept dog might include field regions dealing with physical appearance, typical sounds made, foods eaten, relations to other pets, typical behaviors, and so on. In COGBASE, knowledge from one or more subfields is selected at runtime based on contextual needs.

The algorithm includes two versions: Time-Optimized and Coverage-Optimized. Time-Optimized selects for a quicker response at the expense of the amount of data scrutinized to determine characterizations, and Coverage-Optimized the reverse. The Coverage-Optimized algorithm can be used to extend the 'radius' of the field having the input concept at its center.

In addition to a list of concepts characterizing a particular field, the algorithm also returns Relative Anchor Importance Index (RAII) values indicating the importance of each individual concept relative to the overall field definition.

Also returned is a flag (lowConceptEntropy) indicating that the database did not hold enough knowledge about the input concept to run the standard noise filtering algorithm, so the low-entropy filter was used instead. The low-entropy filter demands less corroboration of information before accepting it as valid. In theory, this lowered burden could admit more noise, but in practice, accuracy tends to be very good.

Example

Input:

dog

Output (without RAII scores):

dog, animal, pet, mammal, breed dog, cat, carnivore, species, often, bird, canine, popular pet, curious animal, home animal, noun, wild animal, plural cat, food, baby dog, small dog, household animal, live entity, domestic animal, common pet, domestic pet, feline, furry, man best friend, furry pet, pet animal, man, good pet, quadriped, domesticate animal, small animal, type dog, household pet, plant, young dog, predator, house pet, furry animal, person, golden retriever, beautiful animal, house animal, musician, rabbit, face other, hunt animal, ear

In

GET URL: Concept
QUERY: suppressRAII (optional): Suppress return of RAII data

Out

conceptAnchors: list of [concept, RAII] pairs if suppressRAII = False /OR/
                list of concepts if suppressRAII = True /OR/
                empty list if KB does not contain enough knowledge to enable any version of the algorithm

lowConceptEntropy: Boolean flag indicating if low-entropy filtering algorithm was employed
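
Because conceptAnchors can arrive in three shapes, client code may want to normalize it first. The Python sketch below is one way to do that; the function names are hypothetical, and ranking by descending RAII assumes that a higher value means greater importance, which the description suggests but does not state outright.

```python
import json


def parse_anchors(response_text):
    """Normalize `conceptAnchors` to a list of (concept, raii) tuples.

    RAII is None when the server returned bare concepts (suppressRAII = True).
    An empty list means the KB lacked enough knowledge for the algorithm.
    """
    payload = json.loads(response_text)
    anchors = []
    for item in payload.get("conceptAnchors", []):
        if isinstance(item, str):
            anchors.append((item, None))  # bare concept, no score
        else:
            concept, raii = item  # [concept, RAII] pair
            anchors.append((concept, float(raii)))
    return anchors, bool(payload.get("lowConceptEntropy", False))


def strongest(anchors, n=3):
    """Return the n anchors with the highest RAII, ignoring unscored ones."""
    scored = [a for a in anchors if a[1] is not None]
    return sorted(scored, key=lambda a: a[1], reverse=True)[:n]
```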

Semantic Category Membership Degree

/v1/categoryMembershipDegree

This algorithm provides a semantics-based, knowledge-driven mechanism for calculating a concept's degree of membership within a specific category. The algorithm calculates degree based on a wide range of potential matches, ranging from categorical/ontological ('chair' and 'furniture') to general semantic ('meat' and 'tasty'). In the latter case, the system is able to discover a large number of potential ways in which one concept might fit a category without requiring all possible category or descriptive word/object cases to be enumerated in advance.

In addition to degree of fit, the algorithm optionally returns a list of concepts shared between the concept and category fields; this list provides some information on how membership degree was determined.

Example:

Input:

meat,tasty

Output:

[food, 2.0], [animal, 1.73554] Match score: 1.86777

Intermediate bases:

[[food, 110], [animal, 100], [mammal, 50], [pork, 50], [beef, 40], [farm animal, 30], [bird, 30], [barn animal, 30], [lamb, 30], [goat, 30], [bone, 30], [chop, 30], [sheep, 30], [barnyard animal, 30], [ham, 30], [turkey, 30], [pig, 30]]

In

QUERY: concept: Concept to be tested
       category: Category that membership is tested within
       suppressBases (optional): suppress base information

Out

membershipDegree: degree of fit
coreBases:  List of core concepts shared between concept and category fields
intermediateBasesOfComparison: Intermediate bases of comparison between concept and category
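
A small Python sketch for this function. It assumes that the named QUERY parameters are passed as a URL query string and that suppressBases is switched on with the value "true"; neither is confirmed on this page. It also checks an observation about the meat/tasty example: the reported match score equals the arithmetic mean of the two listed base scores. Whether the algorithm always computes the score this way is not stated.

```python
from urllib.parse import urlencode

ENDPOINT = "http://api.cogview.com/v1/categoryMembershipDegree/"


def membership_url(concept, category, suppress_bases=False):
    """Build the request URL (query-string form is an assumption)."""
    params = {"concept": concept, "category": category}
    if suppress_bases:
        params["suppressBases"] = "true"  # assumed flag value
    return ENDPOINT + "?" + urlencode(params)


# Base scores and match score copied from the meat/tasty example.
core_bases = [("food", 2.0), ("animal", 1.73554)]
documented_match_score = 1.86777

# In this example the match score coincides with the mean of the base scores.
mean_score = sum(score for _, score in core_bases) / len(core_bases)
```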

Semantic Feature Generation

In many NLP and machine learning applications, it is useful to be able to generate features for lexical items encountered in input texts, social media posts, etc. Access to such features enables a shift from bag of words to bag of concepts, in which lexical items have deeper meaning and are able to contribute more than just their physical form during processing.

In COGBASE, knowledge is structured in the form of 'atoms', which, taken together, build up meanings. Given this, the simplest way to generate features from lexical items is to identify edges from/to concept nodes and collect the concepts those edges link to. This method, of course, ignores the varying COGBASE primitives attached to each edge, and does not perform any noise filtering, but it does provide an initial step in the right direction.

Recall here that COGBASE edges do not denote child/parent ontological relationships; rather, they include primitives that build meanings. This suggests that the concepts returned by the procedure above will combine a number of different semantic relationships, and will include some noise. Broadly speaking, however, this procedure provides sufficient support for pattern-finding algorithms. Should a more accurate picture of a concept be required, other COGBASE algorithms may be combined to provide it, though these generally require somewhat more computation.

Outbound COGBASE edges denote the meaning by which a concept is itself defined through the aggregate of other information; inbound edges denote the process by which other concepts use a particular concept to define themselves.

Note: Under Version 1 of the API, features returned via the functions in this section are limited in that only a small number of the full set of features for each concept are returned; this is to limit server traffic as well as avoid wholesale downloading of the KB. If access to full feature sets is desired, please contact the author.

/v1/featuresContributedFromThis

Example

Input:

dog

Output:

[small, mammal, animal meat, physical measure, hunter, chew, carnivore, companion animal, cat chase, nonhuman animal, common fear, routine surgery, sign, broad subject, dead animal, high risk occupation, street, design, guard animal, common household pet, non-native predator species, coyote, old lady walk, untrain cage animal, better sense smell person, sausage, product category, grown puppy, short-lived animal, resident pet, move part, small horse, unwant animal, live entity, paw instead hand, go out better let, unwant visitor, suspect, companion, farm animal, new, faithful pet, non-human animal, simple piece, bird, hairy, animal model, interest group, leg, eutherian]

In

QUERY: 
concept: Concept for which features are sought

Out

random-selected-features: randomly selected features as list of concepts

/v1/featuresContributedToThis

Example

Input:

dog

Output:

[relate wolf, spot movement, scratch, bear, zoey, bichon frise, chow chow, robin hood, mother young, psycho, old english sheepdog, smell scent, muzzle, participate game, great pet, obedience, irish setter, shar pei, smell drug, circle house, hear many noise human, fetch object throw, tail, blaze, learn trick, dog show, carrry mouth, guard premise, breed several puppy, airedale terrier, use pen, run fetch frisbee, guard house, gina, cocker spaniel, mother puppy, coursing, listen sound, jack, ration_fac_tn, akita, pariah dog, train catch frisbee, malamute, die, marry, gsd, wire-haired, smell well]

In

QUERY: 
concept: Concept for which features are sought

Out

random-selected-features: randomly selected features as list of concepts

/v1/featuresContributedToAndFromThis

In

QUERY: 
concept: Concept for which features are sought
returnSeparateToAndFrom: returns features as featuresTo and featuresFrom, instead of combined

Out

random-selected-features as list of concepts if returnSeparateToAndFrom = False /OR/
random-selected-featuresTo and random-selected-featuresFrom (concept lists) if returnSeparateToAndFrom = True
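
The shift from bag of words to bag of concepts described in this section can be sketched in Python as below. The lookup is deliberately pluggable: sample_lookup serves a few features copied from the dog examples above instead of calling the API, and both function names are hypothetical.

```python
def bag_of_concepts(tokens, lookup_features):
    """Expand each token with its features, keeping first-seen order
    and dropping duplicates."""
    seen = set()
    bag = []
    for token in tokens:
        for concept in [token, *lookup_features(token)]:
            if concept not in seen:
                seen.add(concept)
                bag.append(concept)
    return bag


# A few features copied from the `dog` example outputs in this section.
SAMPLE_FEATURES = {
    "dog": ["mammal", "carnivore", "companion animal", "tail", "learn trick"],
}


def sample_lookup(token):
    """Stand-in for an API call; returns no features for unknown tokens."""
    return SAMPLE_FEATURES.get(token, [])
```

In real use, lookup_features would wrap one of the three functions above, bearing in mind that Version 1 returns only a random subset of features per call.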

3. Commonsense Reasoning

COGBASE reasoning is cross-domain and highly contextual, combining multiple 'atoms' of information in order to achieve nuanced reasoning.

The following algorithms demonstrate some of the many problems that can be solved using COGBASE data.

Goal Inference

/v1/goalInference

Given that a user has shown interest in some set of items, this algorithm determines what goals the combined presence, use, or acquisition of the given concepts is likely to support.

The input may also take the form of a single concept representing an object, action, or state. In the case of an object, the concept is interpreted as something which has been acquired to help achieve some (unknown/unstated) set of goals, and the algorithm determines what those goals could be. In the case of a single action, the system returns goals which have that action as a component. Finally, in the case of world states (happy, for example), the algorithm discovers goals that could have generated those states and/or that involve objects that can take on those states. In the latter case, the system may also return facilitation nodes (ending in _fac_tn) indicating specific actions that can be taken in order to generate those states.

Examples

ham, bread => sandwich

fork, knife => eat food, eat food off plate, eat

dog => love, comfort elderly, protect belongings, play, guard property

kick => swim, make mad, swimmer, fight, move ball, soccer

happy => life go well, everyone, cat purr, score home run, good grade_fac_tn, child smile, person smile, find lose item, live life_fac_tn, enjoy day_fac_tn, everybody, pay, discover truth_fac_tn, get good grade, win baseball game, taste sweet, enjoy company friend, meet friend, gather energy tomorrow, money, party, celebrate_fac_tn, know healthy, happy, almost everyone, hear sing, surprise_fac_tn, mary sad mary, see idea become reality, read child, remember phone number, celebrate, buy present others, chat friend, love else, cash, download anachronox demo, love another, person, cheer, good lover, smile make person, enjoy day, mother, fun

In

QUERY:
concepts: a comma-separated list of input concepts

Out

goals: list of output goals per description above
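
A minimal Python sketch for calling this function and separating facilitation nodes from ordinary goals. The concepts are passed at the end of the URL as a comma-separated list, as the usage instructions allow; the function names are hypothetical, and stripping the _fac_tn suffix is a presentation choice rather than anything the API requires.

```python
from urllib.parse import quote

FAC_SUFFIX = "_fac_tn"  # marks facilitation nodes, per the description above


def goal_inference_url(concepts):
    """Build the URL with the concepts as a comma-separated path element."""
    joined = ",".join(concepts)
    return "http://api.cogview.com/v1/goalInference/" + quote(joined, safe=",")


def split_goals(goals):
    """Separate plain goals from facilitation nodes (names ending in _fac_tn)."""
    plain, facilitation = [], []
    for goal in goals:
        if goal.endswith(FAC_SUFFIX):
            facilitation.append(goal[: -len(FAC_SUFFIX)])  # drop the suffix
        else:
            plain.append(goal)
    return plain, facilitation
```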

Deep Commonsense Context Awareness / User Concept Interests / Search Augmentation

/v1/deepCommonsenseContext

This algorithm fulfills three different use cases. Overall, the goal is, given either 1) a concept or 2) a concept and a selector of some facet of that concept, to discover which other concepts are most likely to be of interest in the context established by the input.

In search augmentation, the output of this algorithm can be combined with original search terms in order to yield more useful queries.

In the case of user interests, given that a user is interested in an input concept, the algorithm determines which other concepts the user is also likely to be interested in.

When given a facet selector concept, two sub-cases are enabled. In the first sub-case, a concept with multiple senses is narrowed down to only one, specified by a single concept. An excellent example is bank, which can refer either to an institution that manages money or to the side of a river. In this case, if the facet selector is money-related (account, withdrawal, etc.), that sense will be selected and output will be filtered accordingly. Critically, knowledge engineers need not specify which selectors correlate with which senses; the system is able to use the totality of the knowledge base to automatically determine selector-data boundaries.

In the concept-breaking sub-case, a single, complex concept with many facets is broken up and data related to one particular facet is selected for output. As an example, the concept China refers to many things: a physical country located in Asia, a government, a people, various provinces and languages, and so on. The selector term allows the user to choose which aspect of the larger concept they are interested in, and knowledge related to this aspect will be selected.

The system is also able to expand the space of data taken into account by considering information related to the categories to which input concepts belong.

Examples

Search Aug: Earthquake (for the curious, see: Search Aug: Terrorism)

Facets and Selector Concepts

China/government: govern, authority, money, organization, information, system, records, president, country, property

China/Asia: continent, unite[d] state[s], nation, border, queen, america, origin, tropical country, continental area, popular destination, develop country, rapidly develop economy, earth, regional market, geography, property market, hong kong island

In

QUERY:  
concept: main input concept
useInCategories (optional): Enables expansion to inbound categories (other concepts that explicitly belong to the category described by this concept)
useOutCategories (optional): Enables expansion to outbound categories (those categories this concept explicitly belongs to)
selectorConcept (optional):  Concept used to select which aspect of input concept is of interest
useDataConfidenceScores (optional): Use original confidence scores of input data sources as part of output ordering

Out

results: A list of tuples containing the following fields:
    concept: A result concept
    count: Number of times this concept appears across the KB in the context generated by the input concept
    sortScore: Used to order tuples
    confidenceScore: Reflects the combined confidence of the data from which this result was generated

startingConcepts: A list of the concepts about which data was retrieved from the KB, comprised of the input concept plus those additional concepts identified by the category expansion mechanism (if in/out categories were used)
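
The search augmentation use case might be sketched in Python as follows. Assumptions not confirmed by this page: the QUERY parameters are sent as a URL query string, optional flags are switched on with the value "true", each result tuple lists its fields in the order given above, and a higher sortScore should come first. The names ContextResult, context_url and augment_query are hypothetical.

```python
from typing import NamedTuple
from urllib.parse import urlencode

CONTEXT_URL = "http://api.cogview.com/v1/deepCommonsenseContext/"


class ContextResult(NamedTuple):
    """One entry of `results`; field order follows the Out description."""
    concept: str
    count: int
    sort_score: float
    confidence_score: float


def context_url(concept, selector=None, use_in_categories=False,
                use_out_categories=False, use_confidence=False):
    """Build the request URL from the documented parameter names."""
    params = {"concept": concept}
    if selector is not None:
        params["selectorConcept"] = selector
    # Assumption: optional flags are enabled with the literal value "true".
    if use_in_categories:
        params["useInCategories"] = "true"
    if use_out_categories:
        params["useOutCategories"] = "true"
    if use_confidence:
        params["useDataConfidenceScores"] = "true"
    return CONTEXT_URL + "?" + urlencode(params)


def augment_query(search_terms, results, n=3):
    """Append the n highest-scoring result concepts to the search terms."""
    ranked = sorted(results, key=lambda r: r.sort_score, reverse=True)
    extra = [r.concept for r in ranked[:n] if r.concept not in search_terms]
    return list(search_terms) + extra
```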

4. Commonsense Prediction (Past, Future)

Given the large amount of information in COGBASE, it is possible to make predictions about the past and future given information about the present.

Future Prediction

/v1/expectationsFutureTelic/
/v1/expectationsFutureAtelic/

This algorithm, given information that a concept is present now, predicts concepts likely to obtain in the future.

For actions, the algorithm can provide results targeted more to the period when the action is in progress (the telic variant) or the period when the action is completed (atelic).

Example

Sleep (atelic):

waste time, maintain health, rest, sleep, feel much lighter, death, re energize, rejuvinate body, refreshment, feel rest, wake up, rest dream, get better, rest body mind, rejuvenate body, restore mind, person feel better, snore, gain energy, lose job, awakeness, improvement health, energy, replenish neurotransmitter, refresh, bedsore, miss appointment, nightmare, not feel tire anymore, breathe problem surface, lazy, get proper amount rest, relaxation, no long tire, might dream, rest recharge, deep breathe, eat breakfast, escape world, lie bed close eye, wake up hungry, restore body, wake up fully rest, regeneration, feel better, take break work, bedtime, body feel comfortable, run out steam, not tire, pass hour dark, turn off light, drool, restore vitality, relax, rejuvination, baby, pass time, lie down feel sleepy, wake up morning rest, rejuvenation, slumber, fun, refresh mind, rejuvenate, recuperate, restful mind, become little tire, lie down, recover, refresh memory, delay, lucid dream, stay bed, feel energize, make tire person little tire, get rest, rejuvenation, rest mind body, close eye, re-energize, maintain sanity, dont feel sleepy anymore, let body rest, satiate need sleep, become rest, dream, release energy

In

QUERY:  
concept: main input concept

Out

expectations: list of concepts likely to obtain in future

Past Inference

/v1/expectationsPast/

Given that some concept obtains in the present, what concepts were likely present in the recent past?

Example

angry:

stub toe, read newspaper, jump up down, punish, punch, irritate, fight, person, watch television show, mad, fix computer, wait line, involve accident

In

QUERY:  
concept: input concept obtaining at present time

Out

expectations: list of concepts likely to obtain in recent past
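
The three prediction functions in this section take the same single concept input, so one Python helper can select among them. The mode labels and the function name expectations_url are hypothetical; only the three API function names come from this page.

```python
from urllib.parse import quote

# Hypothetical mode labels mapped to the documented function names.
EXPECTATION_FUNCTIONS = {
    "future_telic": "expectationsFutureTelic",
    "future_atelic": "expectationsFutureAtelic",
    "past": "expectationsPast",
}


def expectations_url(concept, mode):
    """Build the URL for a prediction function, concept as final path element."""
    try:
        function_name = EXPECTATION_FUNCTIONS[mode]
    except KeyError:
        raise ValueError(f"unknown mode: {mode!r}") from None
    return f"http://api.cogview.com/v1/{function_name}/{quote(concept)}"
```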

5. Low-Level Commonsense Data Extraction

Concept Information Specificity

/v1/conceptInformationAndCategorySpecificity

This utility function, calculated from the ratio of inbound to outbound category links, answers two questions: 1. how specific is the category represented by a particular concept, and 2. how much information is conveyed by using a particular concept as opposed to another?

As an example, the concept place (semspec 0.00314) is more general (less specific) than United States (semspec 11.0).

In

QUERY:
concept: Input concept

Out

specificity: Specificity score (higher is more specific)

Shared Semantic Touchpoints

/v1/conceptSharedSemanticTouchpoints

Given two concepts, this algorithm finds the other concepts that both have in common within the topology of the KB (i.e. concepts to which both inputs share links).

Example

Input:

acid, base

Output:

theory of dissociation, aqueous liquid, reaction parameter, bile salt, chemical liquid, inorganic chemical, electrolyte, ammonia, conductive material, reactive chemical, environment, program, fuel, ingredient, mixture, combination, material, chemical concept, deamination, reagent, compound, desirable quality, chemical substance, term ... function, traditional general chemistry topic, form, brand, catalyst, constituent, raw material, list material, key word, oxidize agent, stabilizer, inorganic catalyst, volatile compound, agent, ionic compound, topic, volatile organic compound, harsh condition, feature, chemical, parameter, product, object, ph modifier, optional component, chemical compound, water treatment chemical, ionizable compound, class, alcohol, ionic species, chemical additive, liquid, metal, element

In

QUERY:
conceptList: Comma-separated list of two concepts

Out

sharedTouchpoints: List of shared concept touchpoints
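
The idea of shared links can be illustrated on a toy graph in Python. This is not the COGBASE algorithm or its data: the adjacency sets below mix three touchpoints from the example output (electrolyte, reagent, catalyst) with two made-up neighbours (sour, foundation), and the set intersection is simply the plainest reading of "concepts both inputs link to".

```python
def shared_touchpoints(links, first, second):
    """Concepts linked to both inputs in an undirected toy graph."""
    neighbours_a = links.get(first, set())
    neighbours_b = links.get(second, set())
    # Intersect the neighbour sets and exclude the inputs themselves.
    return sorted((neighbours_a & neighbours_b) - {first, second})


# Toy adjacency data; "sour" and "foundation" are invented for contrast.
TOY_LINKS = {
    "acid": {"electrolyte", "reagent", "catalyst", "sour"},
    "base": {"electrolyte", "reagent", "catalyst", "foundation"},
}
```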

Concept Atom Degree

/v1/conceptAtomDegree

This algorithm returns the number of atoms present (node degree) in the knowledge base with respect to particular concepts.

In

QUERY: 
concept: Concept for which data is retrieved

Out

atomDegree: Integer atom count for input concept

Concept Relative Content Amount

/v1/databaseRelativeContentAmount

This algorithm provides a score describing the amount of information contained within the COGBASE KB for the input concept relative to other concepts in the database.

In

QUERY:
concept: Concept for which data is to be retrieved

Out

relativeContentAmountIndex: Float value indicating amount of data present for input concept relative to other concepts

6. Sample Algorithms Not Included In API Version 1

Sense

Data-Driven Sense Induction

Semantic Sense Testing

Psychologically-Aware Reasoning

Action to Emotion Inference

Sentiment Analysis

Commonsense-based polarity detection

General positive/negative determination