Variables

M does not require predefinition of variable type or size. M variables are either local or global. Any variable may be unsubscripted or subscripted.

Arrays and Subscripts

In M, subscripted variables identify elements in sparse arrays. Sparse arrays comprise existing subscripts and data nodes; no space is reserved for potential data nodes. These arrays generally serve logical, rather than mathematical, purposes.

M array subscripts are expressions, and are not restricted to numeric values.

The format for an M global or local variable is:

[^]name[(expr1[,...])]
  • The optional leading caret symbol (^) designates a global variable.

  • The name specifies a particular array.

  • The optional expressions specify the subscripts and must be enclosed in parentheses and separated by commas (,).

The body of the M standard places no restrictions on variable names. However, the portability section of the standard does suggest limits on the length of an individual subscript expression, and on the total length of a variable name. The measurement for the length of names includes the length of the global variable name itself, the sum of the lengths of all the evaluated subscripts, and an allowance for an overhead of two (2) times the number of subscripts. The total must not exceed 237. For globals, GT.M permits this total to be modified with GDE up to 255. For locals, GT.M limits the length of individual subscripts to the maximum string length of 32,767. GT.M restricts the number of subscripts for local or global variables to 31.
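The length measurement described above can be sketched in C. This is only an illustration, not part of GT.M: the helper names are hypothetical, and it assumes that the name is measured without its leading caret and that the subscripts have already been evaluated to strings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define M_PORTABLE_REF_LIMIT 237  /* portability limit quoted in the text */

/*
 * Hypothetical helper (not a GT.M function): length of a variable reference
 * as the portability rule measures it. Assumes `name` excludes the leading
 * caret and `subs` holds the already-evaluated subscript values.
 */
size_t m_reference_length(const char *name, const char *const subs[], size_t nsubs)
{
    size_t total, i;

    assert(name != NULL);
    total = strlen(name);              /* length of the name itself */
    for (i = 0; i < nsubs; i++)
        total += strlen(subs[i]);      /* length of each evaluated subscript */
    return total + 2 * nsubs;          /* overhead of two per subscript */
}

/* Returns non-zero when the reference stays within the portability limit. */
int m_reference_is_portable(const char *name, const char *const subs[], size_t nsubs)
{
    return m_reference_length(name, subs, nsubs) <= M_PORTABLE_REF_LIMIT;
}
```

Under these assumptions, a reference such as ^ACCT(2024,"smith") measures 4 + (4 + 5) + (2 x 2) = 17.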

M Collation Sequences

M collates all canonic numeric subscripts ahead of all string subscripts, including strings such as those with leading zeros that represent non-canonic numbers. Numeric subscripts collate from negative to positive in value order. String subscripts collate in ASCII sequence. In addition, GT.M allows the empty string subscript in most contexts; the null (empty) string collates ahead of all canonic numeric subscripts.
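The default ordering can be modeled with a small C comparator. This is a sketch of the rules as stated above, not GT.M's implementation: the function names are hypothetical, the canonic-number test is a simplified reading of the M rules (no leading or trailing zeros, no plus sign, no exponent), and numeric comparison uses double, which does not carry the full precision of M numbers.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simplified test for a canonic number: "0", "-5", ".5", "12.25" qualify;
 * "01", "1.0", "1.", "-0", "+1", and "1E3" do not and collate as strings. */
static int is_canonic_number(const char *s)
{
    const char *p = s;
    size_t int_digits = 0, frac_digits = 0;

    if (strcmp(s, "0") == 0)
        return 1;
    if (*p == '-')
        p++;
    if (*p == '0')                       /* leading zero, including "-0" */
        return 0;
    while (*p >= '0' && *p <= '9') { p++; int_digits++; }
    if (*p == '.') {
        p++;
        while (*p >= '0' && *p <= '9') { p++; frac_digits++; }
        if (frac_digits == 0 || p[-1] == '0')   /* "1." or "1.50" */
            return 0;
    }
    return *p == '\0' && (int_digits + frac_digits) > 0;
}

/* 0 = empty string, 1 = canonic number, 2 = any other string */
static int collation_class(const char *s)
{
    if (*s == '\0')
        return 0;
    return is_canonic_number(s) ? 1 : 2;
}

/* Returns <0, 0, or >0 when subscript a collates before, with, or after b. */
int m_collate(const char *a, const char *b)
{
    int ca, cb;

    assert(a != NULL && b != NULL);
    ca = collation_class(a);
    cb = collation_class(b);
    if (ca != cb)
        return ca < cb ? -1 : 1;         /* empty, then numbers, then strings */
    if (ca == 1) {
        double x = strtod(a, NULL), y = strtod(b, NULL);
        return (x > y) - (x < y);        /* numeric value order */
    }
    return strcmp(a, b);                 /* byte order, ASCII for ASCII data */
}
```

With this model, the subscripts "", -5, .5, 2, 10, "01", "ABC", and "abc" sort in exactly that order; note that "01" follows 10 because it is a string.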

GT.M allows definition of alternative collation sequences. For complete information on enabling this functionality, see Chapter 12: “Internationalization”.

Local Variables

A local variable in M refers to a variable used solely within the scope of a single process. Local variable names have no leading delimiter.

M makes a local variable available and subject to modification by all routines executed within a process from the time that variable is first SET until it is KILLed, or until the process stops executing M. However, M "protects" a local variable after that variable appears as an argument to a NEW command, or after it appears as an element in a formallist used in parameter passing. When M protects a local variable, it saves a copy of the variable's value and makes that variable undefined. M restores the variable to its saved value during execution of the QUIT that terminates the process stack level associated with the "protecting" NEW or formallist. For more information on NEW and QUIT, see Chapter 6: “Commands”.

M restricts the following uses of variables to local variables:

  • FOR command control variables.

  • Elements within the parentheses of an "exclusive" KILL.

  • TSTART [with local variables list].

  • Argumentless KILL, which removes all current local variables.

  • NEW command arguments.

  • Actualnames used by pass-by-reference parameter passing.

Global Variables and Resource Name Environments

M recognizes an optional environment specification in global names or in the LOCK resource names (nrefs), which have analogous syntax. Global variable names have a leading caret symbol (^) as a delimiter.

M makes a global variable available, and subject to modification by all routines executed within all processes in an environment, from the time that variable is first SET until it is KILLed.

Naked References

M accepts an abbreviation of the global name under some circumstances. When the leading caret symbol (^) immediately precedes the left parenthesis delimiting subscripts, the global variable reference is called a naked reference. M evaluates a naked reference by prefixing the last used global variable name, except for its last subscript, to the list of subscripts specified by the naked reference. The prefixed portion is known as the naked indicator. An attempt to use a naked reference when the prior global reference does not exist, or did not contain a subscript, generates an error.
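The expansion rule can be illustrated with a small C model that works on the text of a reference. This is only a sketch of the rule described above, not how GT.M stores the naked indicator: the function names are hypothetical, and it assumes subscripts are simple literals that contain no commas or parentheses.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define REF_MAX 256

/* Model of the process-wide naked indicator, for example "^A(1," ;
 * an empty string means the naked indicator is undefined. */
static char naked_indicator[REF_MAX];

/* Record a full global reference such as "^A(1,2)". */
void note_global_reference(const char *ref)
{
    const char *open, *cut;
    size_t n;

    assert(ref != NULL);
    naked_indicator[0] = '\0';
    open = strchr(ref, '(');
    if (open == NULL)
        return;                          /* unsubscripted: indicator undefined */
    cut = strrchr(ref, ',');
    if (cut == NULL || cut < open)
        cut = open;                      /* single subscript: keep "^A(" */
    n = (size_t)(cut - ref) + 1;
    if (n >= REF_MAX)
        return;
    memcpy(naked_indicator, ref, n);     /* everything except the last subscript */
    naked_indicator[n] = '\0';
}

/* Expand a naked reference such as "^(3)" into `out`.
 * Returns 0 on success, -1 when the reference cannot be expanded. */
int expand_naked_reference(const char *naked, char *out, size_t outsz)
{
    int n;

    if (strncmp(naked, "^(", 2) != 0 || naked_indicator[0] == '\0')
        return -1;
    n = snprintf(out, outsz, "%s%s", naked_indicator, naked + 2);
    if (n < 0 || (size_t)n >= outsz)
        return -1;
    note_global_reference(out);          /* the expanded reference is now the last one used */
    return 0;
}
```

In this model, after a reference to ^A(1,2), the naked reference ^(3) expands to ^A(1,3), and a following ^(4,5) expands to ^A(1,4,5), which in turn moves the naked indicator one level deeper.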

Because M has only one process-wide naked indicator, which it maintains as a side effect of every evaluation of a global variable, using the naked reference requires an understanding of M execution sequence. M execution generally proceeds from left to right within a line, subject to commands that change the flow of control. However, M evaluates the portion of a SET command argument to the right side of the equal sign before the left side. Also, M does not evaluate any further $SELECT() arguments within the function after it encounters a true selection argument.

In general, using naked references only in very limited circumstances prevents problems associated with the naked indicator.

Global Variable Name Environments

M recognizes an optional environment specification in global names. The environment specification designates one of some set of alternative database files.

The syntax for global variable names that include an environment specification is:

^|expr|name[(subscript[,...])]

In GT.M, the expression identifies the Global Directory for mapping the global variable.

Environment specifications permit easy access to global variables in alternative databases, including other "copies" of active variables in the current database. Environment specifications are sometimes referred to as extended global syntax or extended value syntax.

GT.M also allows:

^|expr1,expr2|name[(subscript[,...])]

Where the first expression identifies the Global Directory and the second expression is accepted but ignored by GT.M.

To improve compatibility with some other M implementations, GT.M also accepts another non-standard syntax. In this syntax, the leading and trailing up-bar (|) are respectively replaced by a left square-bracket ([) and a right square-bracket (]). This syntax also requires expratoms, rather than expressions. For additional information on expratoms, see “Expressions”.

The formats for this non-standard syntax are:

^[expratom1]name[(subscript...)]

or

^[expratom1,expratom2]name[(subscript...)]

Where expratom1 identifies the Global Directory and expratom2 is a dummy variable. Note that the first set of brackets in each format is part of the syntax. The second set of square brackets is part of the meta-language identifying an optional element.

Example:

$ gtmgbldir=TEST.GLD
$ export gtmgbldir
$ GTM
  
GTM>WRITE $ZGBLDIR
TEST.GLD
GTM>WRITE ^A
THIS IS ^A IN DATABASE RED
GTM>WRITE ^|"M1.GLD"|A
THIS IS ^A IN DATABASE WHITE
GTM>WRITE $ZGBLDIR
TEST.GLD
GTM>HALT
  
$ echo $gtmgbldir
TEST.GLD

The statement WRITE ^|"M1.GLD"|A writes variable ^A using the Global Directory, M1.GLD, but does not change the current Global Directory.

Example:

GTM>WRITE $ZGBLDIR
M1.GLD
GTM>WRITE ^A
THIS IS ^A IN DATABASE WHITE
GTM>WRITE ^|"M1.GLD"|A
THIS IS ^A IN DATABASE WHITE

The statement WRITE ^|"M1.GLD"|A is equivalent to WRITE ^A.

Specifying separate Global Directories does not always translate to using separate databases.

Example:

GTM>WRITE ^|"M1.GLD"|A,!,^|"M2.GLD"|A,!,^|"M3.GLD"|A,!
THIS IS ^A IN DATABASE WHITE
THIS IS ^A IN DATABASE BLUE
THIS IS ^A IN DATABASE WHITE

In this example, the WRITE does not display ^A from three different GT.M database files: the first and third references return the same value. The mapping specified with the Global Directory Editor (GDE) determines the database file to which a Global Directory points.

This result could have occurred under the following mapping:

^|"M1.GLD"|A --> REGIONA --> SEGMENTA --> FILE1.DAT
^|"M2.GLD"|A --> REGIONA --> SEGMENT1 --> FILE2.DAT
^|"M3.GLD"|A --> REGION3 --> SEGMENT3 --> FILE1.DAT

For more information on Global Directories, refer to the "Global Directory Editor" chapter of the GT.M Administration and Operations Guide.

Optional GT.M Environment Translation Facility

For users who wish to dynamically (at run-time) determine a global directory from non-global directory information (typically UCI and VOL) in the environment specification, GT.M provides an interface to add an appropriate translation.

Using this facility impacts the performance of every global access that uses environment specification. Make sure you use it only when static determination of the global directory is not feasible. When used, make every effort to keep the translation routines very efficient.

The use of this facility is enabled by the definition of the environment variable gtm_env_translate, which contains the path of a shared library with the following entry point:

gtm_env_xlate

If the shared object is not accessible or the entry point is not accessible, GT.M reports an error.

The gtm_env_xlate() routine has the following C prototype:

int gtm_env_xlate(gtm_string_t *in1, gtm_string_t *in2, gtm_string_t *in3, gtm_string_t *out)

where gtm_string_t is a structure defined in gtmxc_types.h as follows:

typedef struct
{
  int length;
  char *address;
}gtm_string_t;

The purpose of the function is to use its three input arguments to derive and return an output argument that can be used as an environment specification by GT.M. Note that the input values passed (in1, in2 and in3) are the result of M evaluation and must not be modified. The first two arguments are the expressions passed within the up-bars "| |" or the square-brackets "[ ]", and the third argument is the current working directory as described by $ZDIRECTORY.

A return value other than zero (0) indicates an error in translation, and is reported by a GT.M error.

If the length of the output argument is non-zero, GT.M appends a secondary message of GTM-I-TEXT, containing the text found at the address of the output structure.

GT.M does not do any memory management related to the output argument; the external routine should allocate space for the output. The routine must place the returned environment specification at the address it has allocated and adjust the length accordingly. On a successful return, the return value should be zero. If the translation routine must communicate an error to GT.M, it must return a non-zero value. To communicate additional error information, it places the error text at the address where the environment specification would normally go and adjusts the length to match the length of the error text.

The length of the output string may range from 0 to 32,767; otherwise GT.M reports an error.

A zero-length (empty) string specifies the current value of $ZGBLDIR. Non-zero lengths must represent the actual length of the file specification pointed to by address, excluding any <NUL> terminator. If the address field of the output argument is NULL, GT.M issues an error.
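The return conventions described above can be summarized in a minimal translation routine. This is a sketch under stated assumptions, not a recommended implementation: the environment names PROD and TEST and the file names prod.gld and test.gld are hypothetical, the structure is redeclared locally so the fragment stands alone (real code would include gtmxc_types.h), and mapping an empty first argument to the current $ZGBLDIR is an application choice.

```c
#include <assert.h>
#include <string.h>

/* Mirrors the structure shown above; real code would get it from gtmxc_types.h. */
typedef struct
{
    int length;
    char *address;
} gtm_string_t;

/* Static storage: the routine owns the output space, and GT.M never frees it. */
static char gld_prod[] = "prod.gld";              /* hypothetical global directory */
static char gld_test[] = "test.gld";              /* hypothetical global directory */
static char empty_spec[] = "";                    /* zero length: current $ZGBLDIR */
static char err_unknown[] = "Unknown environment; translation failed.";

static int set_out(gtm_string_t *out, char *text, int status)
{
    out->address = text;                          /* must never be NULL */
    out->length = (int)strlen(text);              /* excludes the NUL terminator */
    return status;
}

int gtm_env_xlate(gtm_string_t *in1, gtm_string_t *in2, gtm_string_t *in3, gtm_string_t *out)
{
    (void)in2;                                    /* unused in this sketch */
    (void)in3;
    assert(in1 != NULL && out != NULL);

    /* The inputs are only read, never modified. */
    if (in1->length == 0)
        return set_out(out, empty_spec, 0);
    if (in1->length == 4 && memcmp(in1->address, "PROD", 4) == 0)
        return set_out(out, gld_prod, 0);
    if (in1->length == 4 && memcmp(in1->address, "TEST", 4) == 0)
        return set_out(out, gld_test, 0);

    /* Non-zero return with text: reported with the text as a secondary message. */
    return set_out(out, err_unknown, 1);
}
```

Returning pointers to static storage sidesteps allocation on every call, which suits a routine invoked on each global access that uses an environment specification.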

The file specification may be absolute or relative and may contain an environment variable. If the file specified is not accessible, or is not a valid global directory, GT.M reports errors in the same way it does for any invalid global directory.

It is possible to write this routine in M (as a call-in). However, global variable references in such a routine would change the naked indicator, which environment references normally do not. Depending on the conventions of the application, there might also be difficult name-space management issues, such as protecting the local variables used by the M routine.

While it is possible for this routine to take any form that the application designer finds appropriate within the given interface definition, the following paragraphs make some recommendations based on the expectation that a routine invoked for any more than a handful of global references should be efficient.

It is expected that the routine loads one or more tables, either at compilation or the first time it is invoked. The logic of the routine performs a look up on the entry in the set of tables. The lookup might be based on the length of the strings and some unique set of characters in the names, or a hash, with collision provisions as appropriate.
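One way to realize the hashed lookup mentioned above is an open-addressing table keyed on both input strings. The sketch below is an assumption-laden illustration rather than anything GT.M prescribes: the slot count, the FNV-1a style hash, the linear probing, and all function names are choices made here, and the keys are expected to be NUL-terminated strings that outlive the table.

```c
#include <assert.h>
#include <string.h>

typedef struct { int length; char *address; } gtm_string_t;  /* as shown earlier */

#define SLOTS 8   /* hypothetical size; keep it larger than the number of entries */

typedef struct
{
    const char *key1, *key2;   /* the two environment expressions */
    const char *gld;           /* NULL marks an empty slot */
} slot_t;

static slot_t slots[SLOTS];

/* FNV-1a style mixing over a counted byte string. */
static unsigned hash_bytes(unsigned h, const char *p, int len)
{
    int i;

    for (i = 0; i < len; i++)
    {
        h ^= (unsigned char)p[i];
        h *= 16777619u;
    }
    return h;
}

static unsigned hash_pair(const char *a, int alen, const char *b, int blen)
{
    unsigned h = hash_bytes(2166136261u, a, alen);

    h = hash_bytes(h, ",", 1);             /* separator: ("ab","c") differs from ("a","bc") */
    return hash_bytes(h, b, blen);
}

static int same(const char *key, const char *p, int len)
{
    return strlen(key) == (size_t)len && memcmp(key, p, (size_t)len) == 0;
}

/* Adds an entry; returns 0 on success, -1 when the table is full. */
int table_insert(const char *key1, const char *key2, const char *gld)
{
    unsigned h, n;

    assert(key1 != NULL && key2 != NULL && gld != NULL);
    h = hash_pair(key1, (int)strlen(key1), key2, (int)strlen(key2));
    for (n = 0; n < SLOTS; n++)
    {
        slot_t *s = &slots[(h + n) % SLOTS];   /* linear probing on collision */
        if (s->gld == NULL)
        {
            s->key1 = key1; s->key2 = key2; s->gld = gld;
            return 0;
        }
    }
    return -1;
}

/* Returns the global directory for the pair, or NULL when there is no match. */
const char *table_lookup(const gtm_string_t *in1, const gtm_string_t *in2)
{
    unsigned h = hash_pair(in1->address, in1->length, in2->address, in2->length);
    unsigned n;

    for (n = 0; n < SLOTS; n++)
    {
        const slot_t *s = &slots[(h + n) % SLOTS];
        if (s->gld == NULL)
            return NULL;                       /* empty slot ends the probe */
        if (same(s->key1, in1->address, in1->length) && same(s->key2, in2->address, in2->length))
            return s->gld;
    }
    return NULL;
}
```

The table would be filled once, on the first call, and only read afterwards, so each later translation costs one hash computation and usually a single comparison.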

The routine may have to deal with a case where one or both of the inputs have zero length. A subset of these cases may have the first string holding a comma-delimited string that needs to be reinterpreted as being equivalent to two input strings (note that the input strings must never be modified). The routine may also have to handle cases where a value (most likely the first) is, accidentally or intentionally, already a global directory specification.
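Because the inputs must never be modified, a comma-delimited first argument can be reinterpreted through read-only views rather than by writing a terminator into the buffer. The following sketch shows one way to do that; the view_t type and the split_env_args name are hypothetical, and treating a comma in the first string as a separator only when the second string is empty is an assumption about the application's convention.

```c
#include <assert.h>
#include <string.h>

typedef struct { int length; char *address; } gtm_string_t;  /* as shown earlier */

/* Read-only view into an input string: a pointer plus a length, no copy. */
typedef struct
{
    const char *ptr;
    int len;
} view_t;

/*
 * Produces two views from the two inputs. When the second input is empty and
 * the first contains a comma, the first is treated as "first,second".
 * Neither input buffer is written to.
 */
void split_env_args(const gtm_string_t *in1, const gtm_string_t *in2,
                    view_t *first, view_t *second)
{
    const char *comma = NULL;

    assert(in1 != NULL && in2 != NULL && first != NULL && second != NULL);
    first->ptr = in1->address;
    first->len = in1->length;
    second->ptr = in2->address;
    second->len = in2->length;

    if (in2->length == 0 && in1->length > 0)
        comma = memchr(in1->address, ',', (size_t)in1->length);
    if (comma != NULL)
    {
        first->len = (int)(comma - in1->address);       /* text before the comma */
        second->ptr = comma + 1;                        /* text after the comma */
        second->len = in1->length - first->len - 1;
    }
}
```

The resulting views can be passed to the table lookup in place of the raw arguments, so the same lookup logic serves both the one-argument and two-argument forms.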

Example:

$ cat gtm_env_translate.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "gtmxc_types.h"
static int init = 0;

typedef struct
{
  gtm_string_t field1, field2, ret;
} line_entry ;

static line_entry table[5], *line, linetmp;
/* Since these errors may occur before setup is complete, they are statics */
static char *errorstring1 ="Error in function initialization, environment variable GTM_CALLIN_START not defined. Environment translation failed.";
static char *errorstring2 ="Error in function initialization, function pointers could not be determined. Environment translation failed.";

#define ENV_VAR "GTM_CALLIN_START"
typedef int(*int_fptr)();
int_fptr GTM_MALLOC;

int init_functable(gtm_string_t *ptr)
{
/* This function demonstrates the initialization of other function pointers as well (if the user-code needs them for any reason, they should be defined as globals) */
char *pcAddress;
long lAddress;
void **functable;
void (*setup_timer) ();
void (*cancel_timer) ();

pcAddress = getenv(ENV_VAR);
if (pcAddress == NULL)
{
ptr->length = strlen(errorstring1);
ptr->address = errorstring1;
return 1;
}
lAddress = -1;
lAddress = atol(pcAddress);
if (lAddress == -1)
{
ptr->length = strlen(errorstring2);
ptr->address = errorstring2;
return 1;
}
functable = (void *)lAddress;
setup_timer = (void(*)()) functable[2];
cancel_timer = (void(*)()) functable[3];
GTM_MALLOC = (int_fptr) functable[4];
return 0;
}

void copy_string(char **loc1, char *loc2, int length)
{
char *ptr;
ptr = (char *) gtm_malloc(length);
strncpy( ptr, loc2, length);
*loc1 = ptr;
}

int init_table(gtm_string_t *ptr)
{
int i = 0;
char buf[100];
char *buf1, *buf2;
FILE *tablefile;
char *space = " ";
char *errorstr1 = "Error opening table file table.dat";
char *errorstr2 = "UNDETERMINED ERROR FROM GTM_ENV_XLATE";

if ((tablefile = fopen("table.dat","r")) == (FILE *)NULL)
{
ptr->length = strlen(errorstr1);
copy_string(&(ptr->address), errorstr1, strlen(errorstr1));
return 1;
}
while (fgets(buf, (int)sizeof(buf), tablefile) != (char *)NULL) 
{
line= &table[i++];
buf1 = buf;
buf2 =strstr(buf1, space);
line->field1.length = buf2 - buf1;
copy_string( &(line->field1.address), buf1, line->field1.length);
buf1 = buf2+1;
buf2 = strstr(buf1, space);
line->field2.length = buf2-buf1;
copy_string( &(line->field2.address), buf1, line->field2.length);
buf1 = buf2+1;
line->ret.length = strlen(buf1) - 1;
copy_string( &(line->ret.address), buf1, line->ret.length);
}
fclose(tablefile);
/* In this example, the last entry in the table is the error string */
line = &table[4];
copy_string( &(line->ret.address), errorstr2, strlen(errorstr2));
line->ret.length = strlen(errorstr2);
return 0;
}

int cmp_string(gtm_string_t str1, gtm_string_t str2)
{
if (str1.length == str2.length)
return strncmp(str1.address, str2.address, (int) str1.length);
else
return str1.length - str2.length;
}

int cmp_line(line_entry *line1, line_entry *line2)
{
return (((cmp_string(line1->field1, line2->field1))||(cmp_string(line1->field2, line2->field2))));
}

int look_up_table(line_entry *aline, gtm_string_t *ret_ptr)
{
int i;
int ret_v;

for(i=0;i<4;i++)
{
line = &table[i];
ret_v = cmp_line( aline, line);
if (!ret_v)
{
ret_ptr->length = line->ret.length;
ret_ptr->address = line->ret.address;
return 0;
}
}
/*ERROR OUT*/
line = &table[4];   
ret_ptr->length= line->ret.length;
ret_ptr->address = line->ret.address;
return 1;

}

int gtm_env_xlate(gtm_string_t *ptr1, gtm_string_t *ptr2, gtm_string_t *ptr_zdir, gtm_string_t *ret_ptr)
{

int return_val, return_val_init;
if (!init)
{
return_val_init = init_functable(ret_ptr);
if (return_val_init) return return_val_init;
return_val_init = init_table(ret_ptr); 
if (return_val_init) return return_val_init;
init = 1;
}
linetmp.field1.length= ptr1->length;
linetmp.field1.address= ptr1->address;
linetmp.field2.length= ptr2->length;
linetmp.field2.address= ptr2->address;

return_val = look_up_table(&linetmp, ret_ptr);
return return_val;
}
$ cat table.dat
day1 week1 mumps
day2 week1 a
day3 week2 b
day4 week2 c.gld

This example demonstrates the mechanism. A table is set up the first time for proper memory management, and for each reference, a table lookup is performed. Note that, for simplicity, no error checking is done, so table.dat is assumed to be in the correct format and to have exactly four entries. This routine should be built as a shared library; see Chapter 11: “Integrating External Routines” for information on building a shared library. The function init_functable is necessary to set up the GT.M memory management functions.