Macrocoder

Programming Macrocoder
Book 1

Books index

book 1 - Lifecycles, grammars and phases

book 2 - Core structure

book 3 - Semantic analysis

book 4 - Code generation


1 Programming Macrocoder

Macrocoder allows the development of macrocoding code generators in minutes. To make this possible, a dedicated programming paradigm, called phase programming, has been developed.

This manual goes through the basic concepts on which Macrocoder programming is based.

1.1 Prerequisites

The prerequisites required for this tutorial are:

1.2 Rules and target: don't be confused

When working with a tool like Macrocoder, some confusion about roles might arise. A programmer "A" writes an FCL program that implements a compiler for a new language "L" that she or he has invented. Then a programmer "B" writes a program in "L" using the compiler that "A" developed with Macrocoder. Both programmers use Macrocoder, but in two different roles. We shall call the one who develops the language (i.e. programmer "A") the rules developer, and the one who uses the new language the target developer.

This entire document is dedicated to the rules developer.

1.3 Source files: don't be confused

Some confusion can also arise when talking about source files. A Macrocoder project involves three kinds of source files, and it is very important to always know which is which. These are the three categories, with the name we shall use to refer to each of them:

  1. rule sources - are the source files of the rules project, written in the FCL language (extensions .fcl and .fcg); they define the macrocoding language and what code has to be generated;
  2. target sources - are the source files of the target project, written in the new language created by us;
  3. generated sources - are the source files generated by the code generator, usually in languages like C, Java, HTML etc.; strictly speaking, they are not source files for Macrocoder (indeed they are its output files), but since such files are normally referred to as "source" we shall keep this name.

2 Macrocoder programming basics

Macrocoder does its job by executing a program written in a language, similar to Java or C++, called "FCL". The program is written within the Macrocoder IDE which will take care of compiling and executing it.

The FCL language is based on common concepts: it has classes, methods, attributes, inheritance and interfaces, like Java or C++, plus some extra unique features designed specifically for macro code generation.

2.1 Classes and objects

The FCL language, like any other object-oriented language, supports the concepts of "class" and "object". The class is the description of the type: which methods and attributes it has. The object is a segment of memory allocated to host the data requested by the class definition. If a class is the design of a car, an object is one real car of that model. The action of creating an object that follows the rules defined in a class is called instantiation.

In FCL some classes are created by the developer, while others are generated automatically by the Macrocoder environment. Finally, several utility classes are available from the included Macrocoder Library.

As we will see, this also applies to objects: some objects are created by the programmer, while others are automatically instantiated by the Macrocoder Runtime Environment.

2.2 Classes

A class definition in FCL is very similar to Java or C++:

class MyClass {
	// Numeric attribute
	Int myNumber;
	
	// Text attribute
	String myText;
	
	// Method
	Void showContents () const {
		system().msg << "Number:" << myNumber << " text:" << myText << endl;
	}
}

By default, all members are public. Although FCL supports public/protected/private as expected, they are seldom used because phase protection, which will be explained later, is more powerful.

2.3 Class extension

One of the unusual concepts widely used in FCL is class extension. Class extension allows splitting the definition of a class across multiple files. For example, the following declaration generates the same class as the example above, but the three components (myNumber, myText and showContents) are added separately:

class MyClass {
	// Numeric attribute
	Int myNumber;
}

extend class MyClass {	
	// Text attribute
	String myText;
}
	
extend class MyClass {	
	// Method
	Void showContents () const {
		system().msg << "Number:" << myNumber << " text:" << myText << endl;
	}
}

The extension concept is very important in Macrocoder programming, because some parts of the process require adding methods to classes created internally by Macrocoder. Since internal classes do not have a source file with "class ...", the only way to add methods and attributes to them is by extension.

2.4 Primitive types

Macrocoder supports some primitive types. They are:

Int          32-bit signed integer values
Huge         64-bit signed integer values
Float        64-bit floating point values
String       UNICODE string of any length
DecoString   UNICODE string of any length with decoration (bold, italic, font size, etc.)
Binary       Generic binary data
Void         Used as a method return type to represent procedures not returning any value

There is no Boolean type: the boolean expressions are implemented with the Int type.

2.5 Composite types

The composite types are a means to group different objects in a complex structure. Composite types are characterized by ownership: the container owns the contained object. This means that if the container is deleted, so are the contained objects. Also, a given object can be owned by at most one container.
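The two ownership rules can be modelled in a few lines. The following is only a Python sketch of the idea, not FCL: the names Owned, Container, adopt and delete are invented for the illustration and do not belong to the Macrocoder library.

```python
class Owned:
    """Any object that can be placed in an owning container (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.owner = None  # an object has at most one owner


class Container(Owned):
    """A container that owns its children."""

    def __init__(self, name):
        super().__init__(name)
        self.children = []

    def adopt(self, obj):
        # Rule 1: a given object can be owned by at most one container.
        if obj.owner is not None:
            raise ValueError(f"{obj.name} is already owned by {obj.owner.name}")
        obj.owner = self
        self.children.append(obj)

    def delete(self):
        # Rule 2: deleting the container deletes the contained objects too.
        # Returns the names of everything that was deleted, children first.
        deleted = []
        for child in self.children:
            if isinstance(child, Container):
                deleted.extend(child.delete())
            else:
                deleted.append(child.name)
        self.children.clear()
        deleted.append(self.name)
        return deleted


bar = Container("bar")
f1, f2 = Owned("f1"), Owned("f2")
bar.adopt(f1)
bar.adopt(f2)

other = Container("other")
try:
    other.adopt(f1)  # f1 already belongs to bar
    second_owner_allowed = True
except ValueError:
    second_owner_allowed = False

deleted_names = bar.delete()
print(second_owner_allowed, deleted_names)
```

In this model the second adoption is refused, and deleting bar also removes f1 and f2.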

At this stage, we shall take a quick overview. We shall treat them in detail later on.

2.5.1 Class

The class is obviously the first way to create composite types. A class can contain attributes of other types. In this example, class Bar contains two instances of class Foo called f1 and f2:

class Foo {
	Int x;
	Int y;
}

class Bar {
	Foo f1;
	Foo f2;
}

2.5.2 Array

The array allows one class to own a variable number of instances of another class.

In the example below, the container class Bar can contain any number of instances of Foo (or any type derived from Foo) in an array named myFoos. The array owns the objects it contains:

class Foo {
	Int x;
	Int y;
}

class Bar {
	array of Foo myFoos;
}

2.5.3 Variant

The variant can contain zero or one element of the indicated type or of a type derived from it. It is exactly like an array, but limited to a maximum of one element.

class Foo {
	Int x;
	Int y;
}

class Bar {
	variant of Foo myFoo;
}

3 Lifesets and lifecycles

One of the key concepts in Macrocoder programming is the lifeset.

A lifeset does two things:

  1. it represents a namespace, i.e. a container of classes;
  2. it is a container of root objects, called the cauldron.

A lifecycle is a container of lifesets. Most Macrocoder projects will work with one single lifecycle, whose default name is MAIN. For this tutorial we will assume that there is one single lifecycle with the default name MAIN.

3.1 Lifeset namespace

In Macrocoder, types can be divided into three families:

  1. utility library classes (like, for example, String);
  2. classes automatically generated by the Macrocoder environment;
  3. classes written by the FCL rules programmer;

The utility library classes are available globally and can be used anywhere. Although they can also be user-defined, in most projects they are represented only by the utility classes coming from the Macrocoder library. They include classes and functions to manipulate strings, files and so on.

The other two families are always created within a namespace called lifeset. A Macrocoder project always has at least one lifeset, but usually there are two (we'll see later why).

In this code snippet, we have defined the Foo class within a lifeset called CORE:

lifeset CORE;

class Foo {
	Int x;
	Int y;
}

Lifesets are created automatically as soon as they are "cited". As soon as the declaration lifeset X; is made, the lifeset X begins to exist in the project. The declaration lifeset X; can then be repeated any number of times (actually, it has to be repeated in every source file that refers to that lifeset).
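The "created when first cited" behaviour resembles a get-or-create lookup. The following Python sketch models it under that assumption; the Project class and its dictionary layout are hypothetical and only illustrate the idea, they are not how Macrocoder stores lifesets.

```python
class Project:
    """Hypothetical model of a rules project holding its lifesets by name."""

    def __init__(self):
        self.lifesets = {}

    def lifeset(self, name):
        # The first mention creates the lifeset; later mentions return the same one.
        return self.lifesets.setdefault(name, {"classes": {}, "cauldron": []})


project = Project()

core_a = project.lifeset("CORE")       # first source file citing CORE
core_a["classes"]["Foo"] = ("x", "y")  # class Foo with attributes x and y

core_b = project.lifeset("CORE")       # second source file citing CORE

print(core_a is core_b, sorted(project.lifesets))
```

Both declarations refer to the same lifeset, so a class defined in one source file is visible from the other.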

3.2 Lifeset cauldron

The lifeset is not only a namespace for classes: it is also a root container for objects. Every lifeset has a singleton object, called the cauldron, that serves as root container for all the objects instantiated within the lifeset itself.

In the Macrocoder language, all the objects are dynamic: they are normally destroyed as soon as the function that created them returns. The only way to keep them alive is to assign them to an owning container: an array, a variant or the lifeset cauldron.

Thanks to the above composite data types, the Macrocoder objects form trees, i.e. acyclic graphs that start from a single root object. All the root objects are stored in the "cauldron".

Figure 3.2.1 - Example of objects organized in trees rooting from a lifeset cauldron.

4 Grammars

A Macrocoder program actually implements a compiler. The first thing every compiler does is to read the source file it has to compile. We know from the chapter Rules and target that the compiler is developed by the rules developer and the source file it reads is written by the target developer.

4.1 Syntax checking

The first step performed by Macrocoder is parsing, also known as syntax analysis. During this operation, the input source is verified against the syntax defined by the rules developer. Checking the syntax means verifying that the sequence of keywords and other elements conforms to the defined rules.

For example, this snippet conforms to Java syntax:

class HelloWorld {
}

Instead, although it contains exactly the same information, this snippet does not conform to Java syntax:

HelloWorld class {
}

Indeed, the syntax of Java requires the keyword class followed by a variable identifier representing the name of the class, then followed by an open brace { and so on. The second example is not respecting this sequence, thus violating the syntax rules. The Java compiler would complain signaling Syntax error at line 1.
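To make the idea concrete, here is a minimal Python sketch of a checker that accepts only the sequence class, identifier, {, }. It is not how Macrocoder or the Java compiler work internally; the function names are invented, and characters that match no token pattern are simply ignored to keep the sketch short.

```python
import re

# A token is either a word (letters, digits, underscore) or a single brace.
TOKEN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*|[{}]")


def tokenize(text):
    # Characters that match no token pattern are skipped in this sketch.
    return TOKEN.findall(text)


def conforms(text):
    """Return True if text is exactly: 'class' identifier '{' '}'."""
    tokens = tokenize(text)
    return (
        len(tokens) == 4
        and tokens[0] == "class"
        and tokens[1] not in ("class", "{", "}")
        and tokens[2:] == ["{", "}"]
    )


print(conforms("class HelloWorld {\n}"))  # True
print(conforms("HelloWorld class {\n}"))  # False
```

Both inputs contain the same tokens; only their order decides whether the syntax is respected.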

4.2 Grammar definition

In order to be able to validate the target source and show any violation of the syntax rules, Macrocoder, like any other parser, must be aware of the syntax rules currently in force.

This is done by adding a grammar definition to the Macrocoder rules project (see the beginners' guide for step-by-step instructions on how to create and edit a grammar).

4.2.1 Sequence

A grammar definition is a tree of parsing rules that starts from a given root node. Let's see a very simple Macrocoder grammar definition designed to parse exactly the Java example above:

Example of a grammar definition

The above grammar definition contains one single rule called MyRule (1). The small red box containing the text "java" (2) means that this is the root rule for files whose file name extension is .java. In other words, every *.java file added to the target project will be parsed starting from rule MyRule.

The rule specifies that a valid target source file must begin with the keyword class (3) followed by an identifier (4).

An identifier is any non-empty sequence of letters (A-Z, a-z), digits (0-9) or underscores (_) that does not begin with a digit and that does not match any defined keyword. So abc is a valid identifier, while x@y is not because it contains an invalid character (@).
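The identifier rule can be expressed as a regular expression plus a keyword check. The Python sketch below assumes that both upper and lower case letters are allowed (as the example abc implies) and uses the keywords of the example grammar; is_identifier and KEYWORDS are names made up for this illustration.

```python
import re

# Letters, digits and underscores, not starting with a digit.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

# Keywords of the example grammar; an identifier must not match any of them.
KEYWORDS = {"class", "{", "}"}


def is_identifier(text):
    return IDENTIFIER.fullmatch(text) is not None and text not in KEYWORDS


for candidate in ["abc", "abc_123", "x@y", "1abc", "class"]:
    print(candidate, is_identifier(candidate))
```

The first two candidates are accepted; x@y fails on the invalid character, 1abc on the leading digit, and class because it is a keyword.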

After the identifier, the grammar expects the keyword { (5) and the keyword } (6). Finally, the end of rule symbol (7) states that nothing else may follow.

The rule above is called a sequence because its items are to be expected once and in the specified sequence.

4.2.2 Repetition

The grammar defined in the previous chapter accepts this target source file:

class FirstClass {
}

However, it rejects this source file:

class FirstClass {
}

class SecondClass {
}

The reason is very simple: the above grammar expects one class name{} not class name1{} class name2{}.

If we want to be able to accept a sequence made of any number of class name{}, we must use a repetition:

Example of a repeated subrule within a grammar definition

The repetition symbol (1) means that the contained sub rule (2) can be repeated zero or more times. With this addition, a sequence of any number of class name{} is accepted.
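In a hand-written parser, a repetition becomes a loop around the sub rule. The following Python sketch shows this under simplifying assumptions (a regular-expression tokenizer and invented function names); it only models the behaviour described above and is not the Macrocoder parser.

```python
import re

KEYWORDS = {"class", "{", "}"}


def tokenize(text):
    return re.findall(r"[A-Za-z_][A-Za-z0-9_]*|[{}]", text)


def parse_classes(text):
    """Return the list of class names, or raise SyntaxError."""
    tokens = tokenize(text)
    names = []
    pos = 0
    # Repetition: the sub rule is applied zero or more times.
    while pos < len(tokens) and tokens[pos] == "class":
        body = tokens[pos + 1:pos + 4]  # identifier, '{', '}'
        if len(body) != 3 or body[0] in KEYWORDS or body[1:] != ["{", "}"]:
            raise SyntaxError(f"malformed class at token {pos}")
        names.append(body[0])
        pos += 4
    # End of rule: nothing else may follow.
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return names


print(parse_classes("class FirstClass {\n}\n\nclass SecondClass {\n}"))
print(parse_classes(""))
```

An empty source is accepted as well, because the sub rule may be repeated zero times.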

4.2.3 Reference

The reference symbol allows splitting a long rule into smaller sub rules; it also allows reusing the same sub rule in multiple places.

Example of a reference

In this case, the OneClass sub rule has been placed in its own standalone rule (2). It is then referenced (1) by the MyRule rule.

4.2.4 Choice

The rules shown above were able to recognize a fixed sequence. However, in almost every language, the grammar must be able to recognize different alternatives.

For example, in the snippet below we have a sequence of entries that can be either class or interface:

class FirstClass {
}

interface MyInterface {
}

class SecondClass {
}

This can be achieved using the choice symbol:

Example of a choice within a grammar definition

As the graph lines intuitively suggest, at the choice symbol the flow splits among multiple valid choices. The parsing can continue with a class keyword or with interface. The effect of the specification above is that the grammar expects a sequence of any number of class ... or interface ... written in any order.

4.2.5 Optional choice

Another very common case happens when an entry is optional. For example:

class Super {
}

class Sub extends Super {
}

The extends Super is a component that can be optionally present in a Java class definition.

This is nothing more than a choice in which one branch is an empty rule:

Example of an optional choice within a grammar definition

Here we have a choice symbol again (1), but we activated the optional line (2), which states that one of the options is simply to skip all the other options entirely.
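Choice and optional choice can be sketched together in a small hand-written parser. The Python code below assumes a combined grammar (any number of class or interface entries, each with an optional extends part), which is our own combination of the two examples; the function names are hypothetical and the code does not reflect Macrocoder internals.

```python
import re

KEYWORDS = {"class", "interface", "extends", "{", "}"}


def parse_entries(text):
    """Return a list of (kind, name, parent) tuples, or raise SyntaxError."""
    tokens = re.findall(r"[A-Za-z_][A-Za-z0-9_]*|[{}]", text)
    pos = 0
    entries = []

    def expect_identifier():
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] in KEYWORDS:
            raise SyntaxError(f"identifier expected at token {pos}")
        pos += 1
        return tokens[pos - 1]

    def expect_keyword(word):
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] != word:
            raise SyntaxError(f"{word!r} expected at token {pos}")
        pos += 1

    while pos < len(tokens):
        # Choice: the entry starts with either 'class' or 'interface'.
        kind = tokens[pos]
        if kind not in ("class", "interface"):
            raise SyntaxError(f"unexpected token {kind!r}")
        pos += 1
        name = expect_identifier()
        # Optional choice: 'extends <identifier>' may be skipped entirely.
        parent = None
        if pos < len(tokens) and tokens[pos] == "extends":
            pos += 1
            parent = expect_identifier()
        expect_keyword("{")
        expect_keyword("}")
        entries.append((kind, name, parent))
    return entries


print(parse_entries("class Super {\n}\n\nclass Sub extends Super {\n}"))
```

The empty branch of the optional choice shows up as parent being None for entries without an extends part.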

4.2.6 Terminals

In a grammar definition, a terminal is a rule that can not be further expanded. Terminals are the leaves of the grammar tree. In the case of Macrocoder, terminals are keywords and fields.

  • keyword - The keyword terminal matches exactly the text indicated. It can be a word like class or a symbol like +.
  • identifier - The identifier terminal matches any sequence of ASCII characters in the ranges A-Z, a-z and 0-9 plus the underscore symbol _. To be a valid identifier, the sequence must not begin with a digit. Also, a valid identifier must not match words defined in the grammar as keywords. Valid examples are a and abc_123.
  • quoted - The quoted terminal matches a string enclosed within quotes, for example "Hello world!". The characters within the quotes support some escapes, exactly as in C or Java. For example, the string My display is 15" wide must be written as "My display is 15\" wide".
  • numeric - The numeric terminal matches a number in various forms; a number always begins with a digit in the range 0...9. Numbers can be expressed as decimal (e.g. 123), floating point (1.23, 0.2E+4, etc.) or hexadecimal (0xAF12).
  • freetext - The freetext terminal matches a special kind of text managed by the Macrocoder editor. This text mode is activated within the editor and it is highlighted with a yellow background.
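The quoted and numeric terminals can be approximated with regular expressions. The Python sketch below is based only on the descriptions above: the exact set of escapes and number formats accepted by Macrocoder may differ, so the patterns and the names unquote and is_numeric are assumptions made for the example.

```python
import re

# A quoted terminal: characters between quotes, where a backslash escapes the next one.
QUOTED = re.compile(r'"((?:[^"\\]|\\.)*)"')

# A numeric terminal: hexadecimal, or decimal with optional fraction and exponent.
NUMERIC = re.compile(r"0[xX][0-9A-Fa-f]+|[0-9]+(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?")


def unquote(token):
    """Return the text of a quoted terminal with backslash escapes resolved.

    Only the simple case is handled: a backslash is dropped and the
    following character is kept as it is.
    """
    match = QUOTED.fullmatch(token)
    if match is None:
        raise ValueError("not a quoted terminal")
    return re.sub(r"\\(.)", r"\1", match.group(1))


def is_numeric(token):
    return NUMERIC.fullmatch(token) is not None


print(unquote(r'"My display is 15\" wide"'))
for candidate in ["123", "1.23", "0.2E+4", "0xAF12", ".5", "abc"]:
    print(candidate, is_numeric(candidate))
```

Note that .5 is rejected because, as stated above, a number must begin with a digit.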

For example, this rule:

Example of a rule with a numeric terminal

matches a string like this:

set myNumber = 1234

While this rule:

Example of a rule with a quoted terminal

matches a string like this:

set myText = "Hello world"

4.3 Grammar classes and objects

So far we have seen how the grammar rules can be defined. We have learned that the Macrocoder parser will be able to verify if a target source file complies with the grammar rules and, in case of violation, it will report a syntax error message.

The following step is to learn how the information gathered by the Macrocoder parser can be used by the rules programmer.

Let's take one of the previous examples:

set myText = "Hello world"

Thanks to the Macrocoder parser, we know that the target input source is correctly formed. That's certainly a good start, but now we need to know the name of the variable being set (myText) and the text being assigned to it (Hello world) so we can take the following steps in code generation. In other words, we want to access the parsing tree.

4.3.1 Grammar classes

Every Macrocoder grammar definition is associated to a lifeset. The lifesets that host grammar rules, called grammar lifesets, have some extra features but they behave exactly as every other lifeset. When defining grammar rules, the default name of the associated lifeset is GRAMMAR.

Macrocoder will make the parsing tree available through the automatic creation of classes and objects within its lifeset.

Let's start with a very simple example. We define this very simple grammar:

Grammar definition of the rule MyRule with its generated class

The Macrocoder grammar editor, using a green dotted box, shows us that for that rule it will create a class named MyRule (1) containing two attributes: varName (2) and value (3).

The class created automatically will look like this:

class MyRule: GBase {
	GString varName;
	GString value;
}

Since that class is defined internally, there is no actual source to be browsed. However, it can be seen in the insight view: click on the Insight tab and browse to MAIN, GRAMMAR and MyRule:

The MyRule class shown in the Insight view

4.3.2 Grammar objects

When a real target source file is fed to Macrocoder, its parser analyzes it and creates one or more objects instantiating the classes shown in the previous chapter. The newly created objects are bound to the grammar lifeset cauldron.

With this target source as input:

set myText = "Hello world"

the Macrocoder parser would instantiate a single object of class MyRule and it would store it in the cauldron objects container of the lifeset named GRAMMAR. That object would have its attributes set as follows:

  • varName = myText
  • value = Hello world

4.3.3 Composite grammar classes

Let's now take a look at a slightly more complex grammar:

Figure 4.3.3.1 - Grammar definition able to parse files like example in figure 4.3.3.2.

This grammar is able to recognize a sequence of zero or more set x="..." statements:

set myText = "Hello world"
set yourText = "ABCD"
set hisText = "His display is 15\" wide"
Figure 4.3.3.2 - Example of a target source file that can be parsed by the grammar defined in figure 4.3.3.1.

In this case, Macrocoder will generate the same MyRule class as before, plus a new class named ManySets containing an array of MyRule named setEntries:

class ManySets: GBase {
	array of MyRule setEntries;
}

class MyRule: GBase {
	GString varName;
	GString value;
}
Figure 4.3.3.3 - Classes generated by Macrocoder for the grammar above.

When parsing the example above, Macrocoder will create one instance of ManySets associated to the cauldron; then, it will create three instances of MyRule that will be bound to the setEntries array of the ManySets instance:

  • lifeset GRAMMAR (cauldron)
    • obj1: ManySets
      • setEntries
        • child1: MyRule
          • varName = myText
          • value = Hello world
        • child2: MyRule
          • varName = yourText
          • value = ABCD
        • child3: MyRule
          • varName = hisText
          • value = His display is 15" wide
Figure 4.3.3.4 - Structure of objects resulting from parsing the target source file of figure 4.3.3.2.
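The object structure above can be reproduced with a short Python model. This is only a sketch of the result of parsing, not of the Macrocoder parser itself: it assumes one set statement per line, plain Python strings instead of GString, and a list standing in for the cauldron.

```python
import re
from dataclasses import dataclass, field

# One statement: set <identifier> = "<quoted text>"
SET_LINE = re.compile(
    r'\s*set\s+([A-Za-z_][A-Za-z0-9_]*)\s*=\s*"((?:[^"\\]|\\.)*)"\s*'
)


@dataclass
class MyRule:
    varName: str
    value: str


@dataclass
class ManySets:
    setEntries: list = field(default_factory=list)


def parse(source):
    """Build one ManySets object owning one MyRule per set statement."""
    root = ManySets()
    for line in source.splitlines():
        if not line.strip():
            continue  # blank lines carry no statement
        match = SET_LINE.fullmatch(line)
        if match is None:
            raise SyntaxError(f"syntax error: {line!r}")
        value = re.sub(r"\\(.)", r"\1", match.group(2))  # resolve \" escapes
        root.setEntries.append(MyRule(match.group(1), value))
    return root


source = r'''
set myText = "Hello world"
set yourText = "ABCD"
set hisText = "His display is 15\" wide"
'''

cauldron = [parse(source)]  # the root object is bound to the cauldron
for entry in cauldron[0].setEntries:
    print(entry.varName, "=", entry.value)
```

As in the figure, the cauldron holds a single ManySets object whose setEntries array owns three MyRule objects.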

5 Phases

The Macrocoder environment creates the initial objects that contain the information that has been read from the user target source files. From now on, this information has to be processed by our code until the final production of the output.

This process evolves through a sequence of steps in which each step is based on the results obtained by the previous ones, where the first step is the parsing done by Macrocoder. At this stage we will not go through the goals of the various steps yet: they will be the subject of the following chapters. Instead, we shall concentrate on the mechanism that allows these steps to be performed: the phase.

5.1 Phase rules

So far we have learned that the Macrocoder objects are organized in a tree where the ultimate root is the lifeset cauldron. A phase is a feature associated with a lifeset; it consists of traversing the tree, reading/writing attributes or creating other objects, to achieve a declared goal.

A lifeset can have any number of phases. To establish the execution order, i.e. which phase is to be executed first, every phase is assigned a unique number: lower numbers are executed first. For example, phase #24 will be executed before phase #1000. Phase numbers are floating point values: so if we want to add a phase between phases 1 and 2, we can use phase 1.5.

Phase execution starts as soon as Macrocoder has successfully completed the parsing of the target sources and has created the initial instances.

The execution of a phase consists of scanning all the existing objects in order and calling a dedicated user-implemented method on each object.

The objects might have no method for a given phase: this is allowed because not all objects are always involved in all phases.
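The scheduling rules of this chapter can be summarized in a small Python model. It is a sketch under stated assumptions: the phase names Init and Check, the phase_ method naming convention and the flat list of objects are invented for the illustration, and real Macrocoder phases traverse the object tree rather than a list.

```python
class MyRule:
    def __init__(self, var_name, value):
        self.var_name = var_name
        self.value = value
        self.log = []

    # Phase methods, found by name (hypothetical convention for this sketch).
    def phase_Check(self):
        self.log.append("Check")

    def phase_ShowData(self):
        self.log.append("ShowData")


class ManySets:
    """Has no phase method at all, so every phase simply skips it."""

    def __init__(self, entries):
        self.entries = entries


def run_phases(phases, objects):
    """Run phases in ascending number order; return a trace of the calls made."""
    executed = []
    for name, number in sorted(phases.items(), key=lambda item: item[1]):
        for obj in objects:
            method = getattr(obj, "phase_" + name, None)
            if method is not None:  # objects without a method are not involved
                method()
                executed.append((number, name, type(obj).__name__))
    return executed


rules = [MyRule("myText", "Hello world"), MyRule("yourText", "ABCD")]
objects = [ManySets(rules)] + rules

# Phase numbers may be fractional: Check (1.5) runs between Init (1) and ShowData (2).
phases = {"ShowData": 2, "Check": 1.5, "Init": 1}
trace = run_phases(phases, objects)
print(trace)
```

No object implements Init, so that phase runs without calling anything, while Check is executed on every MyRule before ShowData.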

5.2 Execution order

Phases are executed by traversing the tree formed by the objects within a lifeset in a given order.

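Macrocoder supports two execution orders:

  • father-first: the phase method of an object is called before those of the objects it contains;
  • children-first: the phase method of an object is called after those of the objects it contains.

The table below shows the same example with the execution order in the children-first and father-first modes: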

Order: children-first

  • lifeset GRAMMAR (cauldron) [12]
    • obj1: ManySets [11]
      • setEntries [10]
        • child1: MyRule [3]
          • varName [1]
          • value [2]
        • child2: MyRule [6]
          • varName [4]
          • value [5]
        • child3: MyRule [9]
          • varName [7]
          • value [8]

Order: father-first

  • lifeset GRAMMAR (cauldron) [1]
    • obj1: ManySets [2]
      • setEntries [3]
        • child1: MyRule [4]
          • varName [5]
          • value [6]
        • child2: MyRule [7]
          • varName [8]
          • value [9]
        • child3: MyRule [10]
          • varName [11]
          • value [12]
Figure 5.2.1 - Examples of phase method execution order on the data generated by parsing the example at figure 4.3.3.2.
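The two orders correspond to what is usually called pre-order (father-first) and post-order (children-first) tree traversal. The Python sketch below reproduces the numbering of the figure on a generic tree; the Node class and the labels are hypothetical, and attributes and arrays are modelled as plain nodes just as they appear in the figure.

```python
class Node:
    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)


def father_first(node, visit):
    visit(node)  # the parent is visited before its children
    for child in node.children:
        father_first(child, visit)


def children_first(node, visit):
    for child in node.children:
        children_first(child, visit)
    visit(node)  # the parent is visited after all its children


def rule(name):
    return Node(name, [Node(name + ".varName"), Node(name + ".value")])


cauldron = Node("cauldron", [
    Node("obj1", [
        Node("setEntries", [rule("child1"), rule("child2"), rule("child3")]),
    ]),
])


def order(traversal):
    labels = []
    traversal(cauldron, lambda node: labels.append(node.label))
    return labels


for position, label in enumerate(order(children_first), start=1):
    print(position, label)
```

Replacing children_first with father_first in the last loop prints the other column of the figure, starting from the cauldron.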

5.3 Phase method

The phase method is the method automatically invoked by Macrocoder during phase execution.

Let's look right away at an FCL example of a phase method. We consider again the example shown in chapter 4.3.3. In figure 4.3.3.3 we can see that Macrocoder has automatically created two classes: ManySets and MyRule.

The following code snippet implements a phase:

grammar GRAMMAR;

father-first phase ShowData = 1;

extend class MyRule {
	in phase ShowData {
		do {
			system().msg << "Set variable " << varName << " to value " << value << endl;
		}
	}
}	
Figure 5.3.1 - Implementation of the ShowData phase.

Let's comment the various lines:

The tree below shows the objects instantiated when parsing the source file shown in figure 4.3.3.2. The objects involved in phase ShowData (i.e. those having a phase method) are highlighted:

  • lifeset GRAMMAR (cauldron)
    • obj1: ManySets
      • setEntries
        • child1: MyRule
          • varName = myText
          • value = Hello world
        • child2: MyRule
          • varName = yourText
          • value = ABCD
        • child3: MyRule
          • varName = hisText
          • value = His display is 15" wide
Figure 5.3.2 - Structure of objects resulting from parsing the target source file of figure 4.3.3.2, with those involved in phase ShowData highlighted.

It is now time to run the project. The complete rules and target projects for this example can be downloaded at this link: ManySets1.zip.

Start the execution by pressing the "run" icon:

Figure 5.3.3 - Macrocoder output when program is executed.

Macrocoder will parse the target source file. Once done, it will execute the only defined phase, i.e. ShowData. The only phase method for that phase belongs to class MyRule. Since there are three instances of MyRule (because we wrote three lines of "set..." in our source file), the method will be executed three times. We can see the method output in the red box. Note that the variable names and their contents are colored in blue and underlined: if you click on them, Macrocoder will bring you to the exact spot in the source where that string has been defined.

6 Summary

In this book we have covered the following concepts about Macrocoder programming:
