Macrocoder

Programming Macrocoder
Book 3

Books index

book 1 - Lifecycles, grammars and phases

book 2 - Core structure

book 3 - Semantic analysis

book 4 - Code generation


1 Semantic analysis

In the previous books we worked with languages that were so simple that they required no semantic analysis at all. To understand what semantic analysis is and why we need it, we shall move to a new example of a custom language.

Let's create a language called Genealogy that allows us to define a family tree. The language syntax will have to look like this:

person Mike 
person John son of Mike
person Luke son of Mike
person Peter son of Luke
Figure 1.1 - Example of code written in the Genealogy language.

As we know, Macrocoder takes care of the syntax checking. For example, we will not be able to write person Mike father of John because father of is not one of the keywords we defined in our language. However, besides that, we can expect syntactically perfect source code that contains errors. For example, we could write:

person Mike son of John
person John son of Mike
Figure 1.2 - Example of code written in the Genealogy language containing a semantic error.

Although this code respects the syntax, it contains an obvious error: if Mike is John's father, John can not be Mike's father!

These kind of errors are called semantic errors because their detection depends on the logic meaning (called semantics) of the concepts expressed by the syntax.

This is the complete list of the semantic rules we need to enforce:

  1. every person has a unique name (i.e. only one person x is allowed for every x);
  2. some people might optionally have indication of their father name;
  3. if a person has his father indicated, his father must have been defined with person;
  4. the graph formed by the father-son relatioship must be a tree, i.e. it must be acyclic (if Mike is John's father, John can not be Mike's father);

In this book we shall leave code generation temporarily aside and concentrate on semantic analysis.

1.1 Preparing the "Genealogy" language

By using the concepts learned from the previous chapters, we can definine the Genealogy language, so far still without any semantic check. We begin with the grammar (file grammar.fcg):

Grammar for the Genealogy language
Figure 1.1.1 - File grammar.fcg: Genealogy grammar definition.

Then we can define the very simple CORE::Person class in the core.fcl file:

lifeset CORE;

class Person {
	// Unique name of this person
	LocString personName;
	
	// Name of the father or empty if not indicated
	LocString fatherName;
}
Figure 1.1.2 - File core.fcl: CORE class for the Genealogy project.

Then we need a createCore phase that creates the CORE objects from the GRAMMAR (file grammar_createCore.fcl):

grammar GRAMMAR;

father-first phase createCore = 1 creates CORE;

extend class Person {
	in phase createCore {
		link of CORE::Person itsCorePerson;
		
		do {
			var CORE::Person corePerson;
			itsCorePerson.set (corePerson);
			corePerson.personName.set (personName);
			lset.CORE.enroll (corePerson);
		}
	}
}

extend class FatherIndication {
	in phase createCore do {
		upscan(Person).itsCorePerson.fatherName.set (fatherName);
	}
}
Figure 1.1.3 - File grammar_createCore.fcl: implementation of the createCore phase in the GRAMMAR lifeset.

Finally we add a print phase in the core_print.fcl file just to see something when we will try it:

lifeset CORE;

phase print = 10;

extend class Person {
	in phase print do {
		system().msg << "person " << personName;
		if (fatherName.text.length () > 0) {
			system().msg << " son on " << fatherName;
		}
		system().msg << endl;
	}
}
Figure 1.1.4 - File core_print.fcl: implementation of the print phase in the CORE lifeset.

If we run the example of figure 1.1, we will obtain the following output:

Output of the execution
Figure 1.1.5 - Output of the execution of the program at figure 1.1.

Obviously, no semantic checks are implemented so far. Therefore, if we refer to unexisting parents or we define the same person twice, it will not complain.

The complete rules and target projects for at this stage can be downloaded at this link: Genealogy1.zip.

1.2 Checking names duplication

The first semantic check we will implement is to verify that the same person name is not used more than once.

This feature is obtained by using a lookup table, implemented by the Macrocoder composed type lookup_s of .... That composite type creates an indexed table where the key is a LocString and the value is a link to an object whose type has been specified. Let's do the magic (file core_register.fcl):

lifeset CORE;

father-first phase register = 1;

in phase register {
	shared lookup_s of Person knownPeople;
}

extend class Person {
	in phase register do {
		lset.knownPeople.set (personName, this);
	}
}
Figure 1.2.1 - File core_register.fcl: implementation of the register phase in the CORE lifeset.

Let's look at the code:

The method set of the lookup_s composite type detects and reports any duplication automatically.

Let's try to submit a target source containing a couple of duplications:

person Mike
person John son of Mike
person Luke son of Mike
person Peter son of Luke
person Mike
person Peter son of Mike
Figure 1.2.2 - Example of code written in the Genealogy language.

The last two lines are dupes of names declared before. If we run Macrocoder now, it will report:

Output of the execution in case of dupes
Figure 1.2.3 - Output of the execution of the program at figure 1.2.2.

Note that the dupe names are colored in blue: by clicking on them, the editor will jump directly on the spot where those names have been defined in the target source code.

The complete rules and target projects for at this stage can be downloaded at this link: Genealogy2.zip.

1.3 Verifying fathers' existence

Next semantic check will report unexising fathers. To do that, we shall reuse the knownPeople lookup table we already created in the previous step. We can create another source file named core_resolve.fcl:

lifeset CORE;

father-first phase resolve = 2;

extend class Person {
	in phase resolve {
		do {
			if (fatherName.text.length () > 0) {
				lset.knownPeople.get (fatherName);
			}
		}
	}
}
Figure 1.3.1 - File core_resolve.fcl: implementation of the resolve phase in the CORE lifeset.

The action is all at lines 8 and 9: if the fatherName is not empty (i.e., if the user specified a father for this person), the method will look for an object having key fatherName in the global lookup table lset.knownPeople. If the key is not found, Macrocoder will complain with an error.

In the following example, the person at the last line refers to an unexisting father named "Marcus":

person Mike
person John son of Mike
person Luke son of Mike
person Peter son of Luke
person Brian son of Marcus
Figure 1.3.2 - Example of code written in the Genealogy language containing a reference to an unexisting father.

The last two lines are dupes of names declared before. If we run Macrocoder now, it will report:

Output of the execution in case of unexisting reference
Figure 1.3.3 - Output of the execution of the program at figure 1.3.2.

1.4 Avoiding loops

The last check we need to implement is loop verification. The semantic check has to ensure that no one is defined as parent of one of his upper relatives, no matter how far in the parenthood chain.

This is obtained with a very simple modification. The changed lines are evidenced:

lifeset CORE;

father-first phase resolve = 2;

extend class Person {
	in phase resolve {
		dependent link of Person myParent;
		do {
			if (fatherName.text.length () > 0) {
				myParent.set (lset.knownPeople.get (fatherName), fatherName);
			}
		}
	}
}
Figure 1.4.1 - File core_resolve.fcl: implementation of the resolve phase in the CORE lifeset with loop verification.

The changes are very simple:

With this modification, at the end of phase resolve every instance of Person that has a parent, will have a pointer to the parent Person object stored in myParent. Being the myParent link defined as "dependent", Macrocoder will make sure that no loops occur among these instances.

In the following example, all the four listed people are involved in a loop:

person Mike son of Peter
person John son of Mike
person Luke son of Mike
person Peter son of Luke
Figure 1.4.2 - Example of code written in the Genealogy language containing loops.

When started, Macrocoder will abort the execution reporting the loops:

Macrocoder reporting loop errors
Figure 1.4.3 - Output of the execution of the program at figure 1.4.2.

Note that at line 10 of figure 1.4.1 the set method of the dependent link has been called with two parameters:

  1. myParent.set (lset.knownPeople.get (fatherName), fatherName) - first parameter queries the golbal lookup table knownPeople and retrieves the parent Person instance which becomes the target of the link (i.e. the object at which the link points to);
  2. myParent.set (lset.knownPeople.get (fatherName), fatherName) - second parameter is a LocString that contains the text that has to be displayed to represent this object in the error message in case a loop is detetcted; thanks to this string, a meaningful description, useful to fix the loop, of the involved objects can be given to the target user.

2 Summary

In this book we have covered the following concepts about Macrocoder programming:

« Go to book 2 Go to book 4 »