Lexemes

A Lexeme is a unit of meaning. One or more words can be attached to that meaning; words attached to the same meaning are synonyms.

The AboditNLP assembly includes a small set of words and meanings. You can add your own dictionary containing just the words you need, and you can also install one or both of the provided Wordnet packages: AboditNLP.Wordnet and Abodit.Wordnet.Extended. The former contains common words, the latter less common words. All animal names, plant names and genus names, for example, are in the latter.

In English, one meaning may be a subclass of another, or it might be an instance of a class. For example, labrador is a Dog and Dog is a Mammal. But these meanings don't necessarily form a simple tree, so a normal C# class hierarchy cannot be used to model them. Instead, Abodit NLP uses C# interfaces to model meanings, which allows for multiple inheritance. One remaining issue is how to distinguish between the word 'mammal' and the class of all 'mammal' words: by convention AboditNLP uses lower-case interface names to indicate specific instances of a word or meaning and capitalized interface names to indicate a class of meanings.

So Noun.dog is the word "dog" and any synonyms of "dog". Noun.Mammal is the class of all mammals and can recognize any word that is a mammal, but Noun.mammal is the word "mammal" and any synonyms of it.
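
The convention can be illustrated without the library at all. The following is a minimal, self-contained sketch rather than AboditNLP code: the interfaces are stand-ins declared locally, and Pet and LabradorToken are hypothetical names invented for this example. It only shows why interfaces, unlike classes, let one meaning belong to several classes of meaning at once, and how the lower-case word differs from the capitalized class.

```csharp
using System;

// Stand-in marker interfaces, declared locally for illustration only.
// They mimic the naming convention described above; they are NOT the
// interfaces that ship with AboditNLP.
namespace Noun
{
    public interface Mammal { }              // class of all mammal meanings
    public interface Pet { }                 // hypothetical second class of meanings
    public interface mammal { }              // the word "mammal" itself
    public interface Dog : Mammal, Pet { }   // a Dog is both a Mammal and a Pet
    public interface labrador : Dog { }      // the word "labrador"
}

// Hypothetical token type carrying the 'labrador' meaning.
public sealed class LabradorToken : Noun.labrador { }

public static class Program
{
    public static void Main()
    {
        object token = new LabradorToken();

        Console.WriteLine(token is Noun.Dog);     // True
        Console.WriteLine(token is Noun.Mammal);  // True
        Console.WriteLine(token is Noun.Pet);     // True
        Console.WriteLine(token is Noun.mammal);  // False: a labrador is a mammal, but it is not the word "mammal"
    }
}
```

A single class hierarchy could not place Dog under both Mammal and Pet, which is the reason the interface-based model is used.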

You can see the interfaces that have been created for words by visiting the demo page on this site. Type "define mammal" and you'll see all of the meanings that "mammal" can have, and for each you will see the interfaces that it has.

These interface names don't include the customary initial letter 'I' because that would be cumbersome and confusing. They do include a number on the end to distinguish the different meanings one word can have.

When you add your own words you'll probably want to use the same namespaces: Noun, Verb, Adjective, ...

You can use inheritance on these to define classes of words that you want to recognize in one group, e.g. all positive adjectives.

If you add any Token classes, these should also implement the appropriate interface: INoun, IVerb, IAdjective, etc.

Nouns are further sub-classed into singular and plural, verbs by tense and adjectives by comparative or superlative.

namespace Noun
{
    public interface dog : INoun { }
    // You can also distinguish between singular or plural nouns by using interface inheritance like this:
    public interface cat : Noun.Type.Singular { }
    public interface cats : Noun.Type.Plural { }
}

When it comes to verbs, you might want to define specific tenses of them, or have a catch-all interface that handles any tense of a given verb (or verb synset):

namespace Verb
{
    public interface report : Verb.Tense.Present { }
    public interface reported : Verb.Tense.Past { }
}

Use interface inheritance to group your meanings into single interfaces that can be recognized by a single rule.

public interface CompanyName : ProperNoun { }
public interface Microsoft : CompanyName { }
public interface Google : CompanyName { }
public interface Facebook : CompanyName { }
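
To make the grouping idea concrete, here is a small self-contained sketch. It does not show how AboditNLP matches rules internally; it only demonstrates, using plain .NET reflection, that every meaning derived from CompanyName satisfies a check against that one interface. ProperNoun is declared locally as a stand-in so the example compiles on its own, and Seattle, GroupingDemo and BelongsTo are hypothetical names added for contrast.

```csharp
using System;
using System.Linq;

// Local stand-ins so the sketch compiles on its own; in a real project
// ProperNoun would come from the library rather than being declared here.
public interface ProperNoun { }
public interface CompanyName : ProperNoun { }
public interface Microsoft : CompanyName { }
public interface Google : CompanyName { }
public interface Facebook : CompanyName { }
public interface Seattle : ProperNoun { }   // hypothetical meaning that is not a company

public static class GroupingDemo
{
    // True when a meaning of type 'candidate' could be passed where 'group' is expected.
    public static bool BelongsTo(Type group, Type candidate)
    {
        return group.IsAssignableFrom(candidate);
    }

    public static void Main()
    {
        Type[] meanings = { typeof(Microsoft), typeof(Google), typeof(Facebook), typeof(Seattle) };

        var companies = meanings
            .Where(m => BelongsTo(typeof(CompanyName), m))
            .Select(m => m.Name);

        Console.WriteLine(string.Join(", ", companies)); // Microsoft, Google, Facebook
    }
}
```

A rule written against CompanyName therefore does not need to change when another company interface is added later.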

Having defined all the meanings that you want to use in your application you now need to define the words that have those meanings. You do this using a class that implements ILexemeGenerator like the one below:

public class SampleWords : ILexemeGenerator
{
    const string ns = "yns:";

    public void CreateWords(ILexemeStore store)
    {
        store.Store(Lexeme.New.Uri(ns + "hello").Noun("hello", typeof(Noun.hello)));
        store.Store(Lexeme.New.Uri(ns + "world").Noun("world", typeof(Noun.world)));
        store.Store(Lexeme.New.Uri(ns + "goodbye").Noun("goodbye", typeof(Noun.goodbye)));
        store.Store(Lexeme.New.Uri(ns + "goodbye").Noun("bye", typeof(Noun.goodbye)));

        store.Store(Lexeme.New.Verb("report", typeof(Verb.report), typeof(Verb.Tense.Present), typeof(Verb.Type.Transitive))
        // If you need to override an irregular verb, or add an interface to a specific tense you can do that
        .Past("reported", typeof(Verb.reported))
        // Verbs can have associated Noun forms, Adverb forms and Adjective forms, e.g. when you report something the thing is a 'report'
        .Noun("report", typeof(Noun.report)));

        ...
    }
}

For URIs you can use anything. I suggest using a terse form of URI like that used in Turtle/N3 (http://www.w3.org/TeamSubmission/turtle/).

Each lexeme is defined using a fluent builder interface that can build graphs of words. The graph allows you to convert between singular and plural forms, between past and present tenses, etc. The graph also represents synonym, antonym, holonym and meronym relationships.

When you add a word to the store, AboditNLP will automatically calculate the plural form of it, and will conjugate regular verbs for you. Of course the English language is never that simple, so if you need to add irregular forms because the plural of "octopus" is "octopi", "octopuses" or "octopodes", you can do that too.
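
The general idea of "regular rules plus explicit irregular overrides" can be sketched as follows. This is an illustration only and is not AboditNLP's algorithm or API: PluralSketch, Pluralize and the Irregular table are hypothetical names, and the regular rules cover just a few common English endings.

```csharp
using System;
using System.Collections.Generic;

public static class PluralSketch
{
    // Irregular forms registered explicitly; these take priority over the regular rules.
    private static readonly Dictionary<string, string> Irregular =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "octopus", "octopuses" },
            { "mouse", "mice" },
        };

    public static string Pluralize(string noun)
    {
        string irregular;
        if (Irregular.TryGetValue(noun, out irregular)) return irregular;

        // Sibilant endings take "es": box -> boxes, church -> churches.
        if (noun.EndsWith("s", StringComparison.Ordinal) ||
            noun.EndsWith("x", StringComparison.Ordinal) ||
            noun.EndsWith("ch", StringComparison.Ordinal) ||
            noun.EndsWith("sh", StringComparison.Ordinal))
        {
            return noun + "es";
        }

        // Consonant + "y" becomes "ies": city -> cities (but day -> days).
        if (noun.Length > 1 &&
            noun.EndsWith("y", StringComparison.Ordinal) &&
            "aeiou".IndexOf(noun[noun.Length - 2]) < 0)
        {
            return noun.Substring(0, noun.Length - 1) + "ies";
        }

        return noun + "s";
    }

    public static void Main()
    {
        foreach (var word in new[] { "report", "box", "city", "octopus" })
        {
            Console.WriteLine(word + " -> " + Pluralize(word));
        }
        // report -> reports
        // box -> boxes
        // city -> cities
        // octopus -> octopuses
    }
}
```

Checking the override table first is what lets an explicitly supplied irregular form replace whatever the regular rules would have produced.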

For each word you add to the dictionary you can attach any number of interfaces, although you'd normally attach only one because the others can be represented through interface inheritance.

Classes instead of interfaces

When you retrieve a lexeme that was created using a list of interfaces, a dynamic object is created with all of those interfaces applied to it. Sometimes, however, you'll want to store your own objects in the lexeme store with additional data, either a live object or just an id for a database object.

To do this you can inherit from one of the Lexeme classes and set the Value property on it.

For example, here's a class defined to hold a reference to a HouseBaseObject:

public class HouseLexeme : LexNoun
{
    public HouseLexeme(string text, HouseBaseObject houseBaseObject)
        : base(SynSet.Get(houseBaseObject.UniqueID), text)
    {
        this.Value = houseBaseObject;
    }
}

And the corresponding LexemeGenerator:

public class HouseObjectLexemeGenerator : ILexemeGenerator
{
    private readonly IHouse house;
    public HouseObjectLexemeGenerator(IHouse house)
    {
        this.house = house;
    }

    public void CreateWords(ILexemeStore store)
    {
        foreach (var houseObject in house)
        {
            store.StoreSingleLexeme(new HouseLexeme(houseObject.Name, houseObject));
        }
    }
}

This places every house object into the LexemeStore at start-up but you can also dynamically add new lexemes at runtime by using the LexemeStore property on an instance of NLP.