Java Tracx - XML tracing language

User's guide

by Bjoern Lemke, Draft 27.11.2008

Tracx comes under the GNU Copyright
(C)opyright 2006, 2007, 2008 by Bjoern Lemke


The Java implementation of tracx comes with the same features as the C++ implementation. In addition, it enables the user to retrieve data from any database engine via a JDBC connection. In this sense, Java tracx can be used as an integration bridge between databases or other legacy systems. Please note that some small syntax changes were made to the execute and dbquery statements.

Table of Contents (TOC)

1. Who should use it?
2. Getting Started
3. Loading data from XML documents
4. Saving data to XML documents
5. Shell execution
6. Query execution
7. Loops and conditions
8. Assignments and expressions
8.1. ... Document assignments
8.2. ... Expression assignments
8.3. ... Content assignments
8.4. ... Element assignments
8.5. ... Passing command line arguments
9. Procedures
10. Threads
11. Built-in functions
11.1. ... gettoken
11.2. ... truncright
11.3. ... truncleft
11.4. ... date2int
11.5. ... replace
11.6. ... replaceall
11.7. ... now
12. Beyond the basics
12.1. ... Number of subelements
12.2. ... Including files
12.3. ... Printing output to files
12.4. ... Deleting elements
12.5. ... Adding elements
12.6. ... String escape characters
13. Restrictions
14. Conclusion

1. Who should use it?

tracx is an appropriate tool for system administrators who need to retrieve system information via standard Unix shell commands and store the data in an XML-based data structure. tracx also supports a CGI mode, which enables the programmer to call tracx as a CGI program for HTML page generation. To use Java tracx, you should be familiar with the Java interpreter, the Unix shell and JDBC connection parameters, as well as with XML and general programming concepts. tracx is also a more complex programming example for the usage of the dragon parser generator, which can be downloaded as an open-source product as well (www.lemke-it.com).

2. Getting started

To get acquainted with the tracx programming paradigm, we start with a small sample program that just prints a 'hello world' string to the console.
--
-- Tracx Hello World sample 
--
program HelloWorld

print stdout 'Hello world\n';

end
A tracx program starts with the keyword PROGRAM followed by the program name and a number of tracx statements. The program is closed with the END keyword. Any line starting with the '--' string is ignored as a program comment. To execute the program, we use the Java interpreter as shown in the following:
$ java -classpath jdom.jar:jtracx.jar de.lemkeit.tracx.TracxMain -f helloworld.tracx
Hello world
$
Please note: The JDOM Java library is required as a utility library for XML handling. You may download this open-source library from the JDOM homepage.

3. Loading data from XML documents

One of the major features of tracx is XML document handling. tracx enables the user to dynamically load and store user data in XML format for further processing. In this way, any valid XML document can be loaded by tracx for tracing or manipulation.
<?xml  version="1.0" ?>
<!DOCTYPE SAMPLEDOC>
<root name="this is the root">
	<sub name="alpha">
		<subsub name="deeper"/>
	</sub>
	<sub name="beta"/>
	<sub name="delta"/>
</root>
As a sample XML document, we use the file rdoc.xml as printed above. The following sample tracx program loads the XML document and traces through all sub elements of the root element.
program LoadDoc

document tdoc;

element root;
element sub;

load #tdoc from 'rdoc.xml';

$root = #tdoc[root];

for $sub in $root.sub
begin
	print stdout 'Tracing ' + $sub(name) + '\n';
	element subsub;
	for $subsub in $sub.subsub
	begin
		print stdout 'Sub Tracing ' + $subsub(name)+'\n';
	end;
end;

end
As we have already seen, a tracx program starts with the PROGRAM keyword followed by the name of the program. The declarations of the required document and element variables follow. A document variable is needed by tracx to load an XML document and to provide a reference to it. Element variables are used as references to XML tags inside a loaded document.

In the sample above, we declare the document tdoc and the two element variables root and sub.

With the following LOAD statement, the document rdoc.xml is loaded into the tdoc document. Please note that a document reference is indicated by a '#' prefix sign. Next, the document's root element is assigned to the root element variable. For this, the document reference is given together with the name of the root element (enclosed in [..] brackets). Please note that element references are indicated by a '$' prefix sign.

Now root holds a valid reference to the document's root element.

With the following for-loop statement, we trace through all sub subelements of the root element. The print statement prints some static text combined with the name attribute value of the current sub element.

To trace deeper into the XML tree, we declare another element called subsub. With the nested loop, we trace all available subsub elements of each sub element and print the name attribute value.
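For readers who want to cross-check what the LoadDoc sample is expected to print, the same traversal can be sketched in Python with the standard xml.etree.ElementTree module. This is only an illustrative model, not part of tracx. It assumes a well-formed rdoc.xml in which the empty elements are self-closing; the function name trace is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed well-formed content of rdoc.xml (empty elements self-closing).
RDOC = """<root name="this is the root">
    <sub name="alpha">
        <subsub name="deeper"/>
    </sub>
    <sub name="beta"/>
    <sub name="delta"/>
</root>"""


def trace(xml_text):
    """Return the lines the LoadDoc sample is expected to print."""
    root = ET.fromstring(xml_text)              # load #tdoc ...; $root = #tdoc[root]
    lines = []
    for sub in root.findall("sub"):             # for $sub in $root.sub
        lines.append("Tracing " + sub.get("name"))
        for subsub in sub.findall("subsub"):    # for $subsub in $sub.subsub
            lines.append("Sub Tracing " + subsub.get("name"))
    return lines


if __name__ == "__main__":
    print("\n".join(trace(RDOC)))
```

Note that findall only visits direct children, which mirrors the way each tracx loop steps exactly one level down the tree.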

4. Saving data to XML documents

Data newly created within a tracx program is stored in memory only. To keep it, it has to be written to a file. Another sample demonstrates this scenario.
program SaveDoc

document tdoc;
element root;
element sub;

$root(name) = 'this is the root';
$root(number) = 3;

$sub = $root new sub;
$sub(name)='alpha';
$sub = $root new sub;
$sub(name)='beta';
$sub = $root new sub;
$sub(name)='gamma';

#tdoc = $root;

save #tdoc to 'wdoc.xml';

end
In the program above, we see the common program header with a declaration of one document variable tdoc and two element variables root and sub.

The next statement assigns a value to the attribute name of the root element. Assignments are either string assignments or integer assignments. The second assignment statement assigns an integer value to the number attribute. Later on, we will discuss in more detail how to handle integer and string values in more complex expressions.

To create a new element, the NEW statement is used. It creates a new subelement under the specified root element. The result of the assignment is a reference to the newly created (sub) element. Further subelements can be created as needed.

After all data has been assigned, it can be stored to an XML file. For this, the root element must be assigned to the document variable. (Please note that the prefix of document variables is '#'.)

Now we write the document to the file wdoc.xml with the SAVE statement.

5. Shell execution

Now we will discuss the execution of shell commands for data retrieval and how to deal with the received output (the fanout). For this, we execute an ls command and print the retrieved output.
program ExecuteCmd

element fanout;
element fname;

execute shellcmd='/bin/bash' cmd='ls' host='local' separator=' \t' into $fanout ls name=1;

print stdout 'ReturnCode=' + exitCode + '\n';

for $fname in $fanout.ls
begin
	print stdout 'FileName=' + $fname(name) + '\n';
end;
end
The execute command expects several input parameters. They are given as a property assignment list as shown above. The following properties are available:

Property    Meaning
cmd         The shell command to be executed.
host        Name of the host where the command is executed. With the value 'local', the command is executed locally.
user        For remote shell commands, the remote user.
separator   The characters that separate the columns of the fanout.
sshcmd      The shell used for remote execution calls (e.g. ssh).
shellcmd    The shell used for local execution calls (e.g. bash).

In our sample, the executed command is a plain ls command that lists all files in the current working directory of the calling shell. The command is executed locally using the standard shell /bin/bash. Please note that each remote execute command requires a configured ssh with password-free authorized access to the remote hosts.

If the command is executed successfully, the retrieved data is stored in the fanout element variable in the following way. For each retrieved line, a new subelement ls is created under the fanout root element. The file name, which occurs in the first column of each output line, is assigned to the attribute name of the subelement (name=1). The column separator is specified as ' \t', which means that space and tab characters separate the columns of a line. Finally, the retrieved fanout data is traced in the for-loop. The return code of the executed command is stored in the exitCode variable.
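The mapping from command output to fanout subelements can be modelled with a short Python sketch. It is an illustration of the description above, not tracx code. The function names split_columns and build_fanout are hypothetical, and the sketch assumes that runs of separator characters count as a single separator.

```python
def split_columns(line, separator):
    """Split a line at any character contained in the separator string."""
    fields, current = [], ""
    for ch in line:
        if ch in separator:
            if current:                 # runs of separators count once (assumption)
                fields.append(current)
                current = ""
        else:
            current += ch
    if current:
        fields.append(current)
    return fields


def build_fanout(output, separator, element_name, columns):
    """Model the fanout: one (element_name, attributes) record per output line.

    columns maps attribute names to 1-based column numbers, as in 'name=1'.
    """
    fanout = []
    for line in output.splitlines():
        fields = split_columns(line, separator)
        if not fields:
            continue                    # skip empty lines
        attrs = {attr: fields[col - 1]
                 for attr, col in columns.items() if col <= len(fields)}
        fanout.append((element_name, attrs))
    return fanout


if __name__ == "__main__":
    sample_output = "a.txt\t12\nb.txt 34\n"
    for name, attrs in build_fanout(sample_output, " \t", "ls", {"name": 1}):
        print(name, attrs)
```

Additional attributes could be mapped to other columns in the same way, for example {"name": 1, "size": 2} in this model.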

6. Query execution

Query execution is a new feature dedicated to the Java version of tracx. You may use any kind of JDBC connection to retrieve data for tracx processing. To use a database query, you first have to establish a JDBC connection to a running database. This is done with the dbopen statement, which takes an appropriate URL, username and password as arguments. A query can then be executed with the dbquery statement. The query attribute of the statement should contain a valid SQL statement. If the query is used for data retrieval, you should specify a valid fanout element to store the output data. This is done in the same way as for shell execution output.
program ExecuteQuery

element fanout;
element row;

dbopen DBCON as ( 'cegojdbc:geek:2200:dab1', 'lemke', 'lemke' );

dbquery DBCON query = 'insert into t1 values ( 1, \'xxx\')';
dbquery DBCON query = 'select * from t1' into $fanout row name=1;

for $row in $fanout.row
begin
	print stdout 'row=' + $row(name) + '\n';
end;

dbclose DBCON;

end
To use any JDBC connection, an appropriate driver must be provided to tracx. This is done with the command line parameter -jdbc. The following wrapper shell script may be adapted and used to call tracx in a more convenient way.
#!/bin/sh

JAVA=java

# customize here the appropriate JDBC classes
JARFILES=./jtracx.jar:./jdom.jar:./cegojdbc.jar

# customize here the appropriate JDBC driver
JDBC_DRIVER=de.lemkeit.cegojdbc.CegoDriver

$JAVA -classpath $JARFILES de.lemkeit.tracx.TracxMain -jdbc $JDBC_DRIVER "$@"
It is possible to use more than one JDBC driver in the same tracx program. For this, all required JDBC drivers must be specified with the -jdbc argument, using the colon sign as a separator:
... -jdbc de.lemkeit.cegojdbc.CegoDriver:oracle.jdbc.driver.OracleDriver

7. Loops and conditions

Like most other programming languages, tracx provides loops and conditions. This enables a tracx programmer to trace through any XML data structure. In the previous samples, we have already seen the usage of for-loops. The following sample also includes some if-statements.
program LoopAndCondition

document tdoc;
element xmltest;

load #tdoc from 'doc.xml';

$xmltest = #tdoc[root];


$xmltest(chk) = 1;
$xmltest(doif) = 'yes';

if $xmltest(doif) = 'yes'
then
   element sub;
   for $sub in $xmltest.sub
   begin
      if $sub(name) = 'alpha' or $sub(name) ~ 'beta'
      then
         print stdout 'Found alpha or beta ....\n';
      elsif $sub(name) = 'delta'
      then
         print stdout 'Found delta ....\n';
      elsif $sub(name) = 'gamma' and int $xmltest(chk) = 1
      then
         print stdout 'Found gamma\n';
      else 
         print stdout 'Nothing found\n';
      end;
   end;
end;

end
The IF condition contains a number of predicates combined with AND or OR operations. Both string and integer comparisons are allowed inside a predicate. All standard comparisons are supported ( =, !=, <, >, <=, >= ). Furthermore, the '~' comparison can be used for pattern matching with regular expressions.

8. Assignments and expressions

For the manipulation of XML data structures, it is essential to provide a powerful set of assignments and expressions.

8.1. Document assignments

Documents and elements can be assigned to each other. Assigning a document to an element variable yields a reference to the document's root element for further program processing; for this, a valid root element must be specified. Assigning an element to a document links the document's root element to a valid element reference. In general, this is done before saving the document to a file.
program AssignDocument

document d1;
element e1;
element e2;

$e1 = #d1[root];
#d1 = $e2;

end
Please note that the specified root element of the document must exist in the XML file loaded into the document.

8.2. Expression assignments

The attribute of an element can be set up with an appropriate value. For this, string and integer expressions are provided.
program AssignExpr

element e1;
element e2;

$e1(a) = '1';
$e2(b) = '2';
$e1(c) = $e1(a) + '2';
$e1(d) = int $e1(a) + 2;
$e1(e) = int $e1(d) + int $e1(a);
$e1(f) = @0;

end
For integer operations, you have to cast the corresponding attribute value with INT to an integer value.
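The difference between string and integer expressions can be illustrated with a small Python model of the AssignExpr sample. It is a sketch under two assumptions: attribute values are stored as text, and '+' without an INT cast concatenates strings. The helper to_int is a hypothetical stand-in for the INT cast, and the @0 assignment is left out.

```python
def to_int(value):
    """Stand-in for the INT cast: interpret an attribute value as an integer."""
    return int(value)


e1 = {}                                             # attributes of element e1
e1["a"] = "1"                                       # $e1(a) = '1';
e1["c"] = e1["a"] + "2"                             # $e1(c) = $e1(a) + '2';
e1["d"] = str(to_int(e1["a"]) + 2)                  # $e1(d) = int $e1(a) + 2;
e1["e"] = str(to_int(e1["d"]) + to_int(e1["a"]))    # $e1(e) = int $e1(d) + int $e1(a);

if __name__ == "__main__":
    for key in sorted(e1):
        print(key, "=", e1[key])
```

In this model, c becomes the text '12' because no cast is applied, while d and e are computed arithmetically and become '3' and '4'.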

8.3. Content assignments

The content of an element can be set to an appropriate value. For this, two kinds of content assignments are available: plain content and CDATA sections.
program AssignContent

element e1;
content $e1 = 'this is content';

print stdout 'This is the content : ' + content $e1;

end

To define CDATA sections for an element, cdata assignments can be used.
program AssignCdata

element e1;
cdata $e1 add 'this is first';
cdata $e1 add 'this is second';

print stdout 'This is the first cdata : ' + cdata[0] $e1;
print stdout 'This is the second cdata : ' + cdata[1] $e1;


end

8.4. Element assignments

Finally, assignments can also be made for elements. This just assigns an element reference to the specified target element.
program AssignElement

element e1;
element e2;
element e3;

-- create new sub element 'sub' under e1 and reference it with e2
$e2 = $e1 new sub;
-- set e2 to the first occurrence of sub element 'sub' in e1
$e2 = $e1.sub;
-- set e3 to e1
$e3 = $e1;

end
Note: No copy of the element is created; only a new reference to the specified element is assigned. To create a new element, the NEW operator must be used.

8.5. Passing command line arguments

Sometimes, it may be useful to pass arguments from the program command line to the program. You can do this using the -a command line argument. All subsequent parameters are passed to the tracx program and can be referenced by using the @ reference operator.
#!/bin/sh

JAVA=java

# customize here the appropriate JDBC classes
JARFILES=./jtracx.jar:./jdom.jar

$JAVA -classpath $JARFILES de.lemkeit.tracx.TracxMain -a alpha beta gamma
See below the corresponding tracx program for using the specified command line arguments
program CommandLineArgs

print stdout 'Arg 0 = ' + @0 + '\n';
print stdout 'Arg 1 = ' + @1 + '\n';
print stdout 'Arg 2 = ' + @2 + '\n';

end

9. Procedures

To achieve a more modular code design, tracx provides procedures, which can be fed with element references. This enables the programmer to encapsulate code segments in dedicated blocks. The sample below shows the usage of procedures.
procedure MyFirstProc(param1)
begin
	print stdout $pp(name) + '\n';
	if $param1(a) = 'yes'
	then
		print stdout 'MyFirstProc ok\n';
	end;
end

procedure MySecondProc(param1 , param2)
begin
	print stdout 'Param1=' + $param1(a) + '\n';
	print stdout 'Param2=' + $param2(b) + '\n';
end


program ProcSample

element p1;
element p2;
element pp;

$pp(name)='I am parent element';
$p1(a) = 'yes';
$p2(b) = 'no';

call MyFirstProc( $p1 );
call MySecondProc( $p1, $p2 );

end
tracx procedures are declared at the beginning of the program. It is recommended to place the procedure implementation in a dedicated file and to include that file into the program with the #inc preprocessor instruction ( see section Beyond the basics ).

In the procedure header, a number of element parameters can be declared. If a procedure is called, the appropriate number of parameters must be given with the procedure call. Procedures are called with the CALL statement followed by the procedure name and the element parameters.

Parent elements from the calling programming block can also be accessed. This is demonstrated in the sample above with the pp element.

10. Threads

For parallel processing, tracx provides the usage of lightweight processes (Threads). A thread is defined in a thread block and can later be instantiated and synchronized.
sem s1;

thread T1(p1) limit 10 timeout 40
begin

	$p1(sleep) = 3;

	print stdout 'This is the thread ' + $p1(name) + '\n';

	element dummy;

	lock s1;

	execute 'sleep ' + $p1(sleep) on 'root'@'local' into $dummy dummy dummy=1;

	print stdout 'Again this is the thread ' + $p1(name) + '\n';
	$p1(done) = 'yes';
	unlock s1;
end

program ThreadSample

document tdoc;
element xmltest;
element tag;

load #tdoc from 'rdoc.xml';

$xmltest = #tdoc[root];

for $tag in $xmltest.sub
begin
	print stdout 'Starting thread ' + $tag(name) + '...\n';
	start T1( $tag );
 	print stdout '... started\n';
end;

print stdout 'Waiting for thread termination ...\n';
sync T1;
print stdout '... all threads terminated\n';

end
In the thread declaration header, a limit and a timeout value must be defined. The limit value defines the maximum number of threads running in parallel. If this number of running threads has been reached, subsequent threads are queued and started only when capacity becomes available again through the termination of previously started threads. The timeout value forces a thread to terminate after the defined time ( in seconds ). This makes sure that no hanging thread causes the complete tracx program to hang.

Threads are started with the start statement followed by the thread name and the appropriate element argument ( argument handling is the same as for procedures ).

To synchronize the started threads, the sync statement is used. The statement waits until all instances of the specified thread have terminated.

For synchronization inside a thread, mutex semaphores can be used. A number of semaphores can be declared with semaphore declarations at the beginning of the file.

To lock a data segment for exclusive use, a lock statement on the appropriate semaphore is executed before accessing the data. After the data has been used, an unlock statement enables other threads to access the data.
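The interplay of limit, start, sync and lock/unlock can be sketched with Python's standard thread pool. This is a conceptual model only, not the tracx implementation. The names run_threads and t1 are hypothetical, and the timeout value is not modelled.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor, wait


def run_threads(names, limit):
    """Run one worker per name, at most `limit` in parallel, and wait for all.

    Returns the peak number of simultaneously running workers and the
    sorted list of finished names.
    """
    s1 = threading.Lock()             # plays the role of 'sem s1'
    bookkeeping = threading.Lock()    # protects the counters below
    state = {"running": 0, "peak": 0, "done": []}

    def t1(name):
        with bookkeeping:
            state["running"] += 1
            state["peak"] = max(state["peak"], state["running"])
        time.sleep(0.01)              # stands in for the work of the thread
        with s1:                      # lock s1 ... unlock s1
            state["done"].append(name)
        with bookkeeping:
            state["running"] -= 1

    # max_workers models 'limit': further workers are queued until a slot is free.
    with ThreadPoolExecutor(max_workers=limit) as pool:
        futures = [pool.submit(t1, name) for name in names]   # start T1( $tag )
        wait(futures)                                         # sync T1
    return state["peak"], sorted(state["done"])


if __name__ == "__main__":
    peak, done = run_threads(["alpha", "beta", "delta"], limit=2)
    print("peak parallel threads:", peak)
    print("finished:", done)
```

The pool queues work beyond the configured size, which corresponds to the queueing behaviour described for the limit value.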

11. Built-in functions

For advanced usage, tracx provides a number of built-in functions, which are briefly explained in this section.

11.1. gettoken

If a string value should be divided into separated token values, the gettoken function can be used. The function is called with three parameter values: the source string, the token separator and the token position, which should be accessed. Sample:
$tok(val) = gettoken('a,b,c,d', ',', 3);
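The behaviour of gettoken can be modelled in Python as follows. The guide does not state whether token positions are counted from 0 or from 1. This sketch assumes 1-based positions, in line with the column numbering of the execute statement ( name=1 is the first column ); treat this as an assumption, not as documented behaviour.

```python
def gettoken(source, separator, position):
    """Return the token at the given position (assumed to be 1-based)."""
    if position < 1:
        raise ValueError("position is assumed to start at 1")
    tokens = source.split(separator)
    return tokens[position - 1]


if __name__ == "__main__":
    print(gettoken("a,b,c,d", ",", 3))
```

Under the 1-based assumption, the sample call above selects the third token of the list a, b, c, d.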

11.2. truncright

With the truncright function, characters can be truncated from the right side of the source string. Sample:
$tok(val) = truncright('Trunc Itxxx', 'x');

11.3. truncleft

With the truncleft function, characters can be truncated from the left side of the source string. Sample:
$tok(val) = truncleft('xxxxTrunc It', 'x');
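As an illustration of truncright and truncleft, the following Python sketch models both functions with the string methods rstrip and lstrip. It assumes that all consecutive occurrences of the given character are removed from the respective side, as the samples with repeated 'x' characters suggest.

```python
def truncright(source, chars):
    """Remove all trailing occurrences of the given characters."""
    return source.rstrip(chars)


def truncleft(source, chars):
    """Remove all leading occurrences of the given characters."""
    return source.lstrip(chars)


if __name__ == "__main__":
    print(truncright("Trunc Itxxx", "x"))
    print(truncleft("xxxxTrunc It", "x"))
```

In this model, both sample calls yield the string 'Trunc It'.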

11.4. date2int

With the date2int function, date values can be converted to integer values. Sample:
$e1(dateval) = '12:22:03 21 01 07';
$e1(intval) = date2int( $e1(dateval), '%H:%M:%S %d %m %y');
print stdout 'dateval=' + $e1(dateval) + '  intval=' + $e1(intval) + '\n';
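The guide does not specify the unit or reference point of the integer returned by date2int. A plausible reading, assumed here, is the number of seconds since 1 January 1970. Under that assumption, the conversion can be sketched in Python, whose time.strptime accepts the same format directives as the sample. The utc parameter is an addition of this sketch to make the result independent of the local time zone.

```python
import calendar
import time


def date2int(value, fmt, utc=False):
    """Convert a date string to seconds since 1970-01-01 (assumed meaning)."""
    parsed = time.strptime(value, fmt)
    if utc:
        return calendar.timegm(parsed)   # interpret the date as UTC
    return int(time.mktime(parsed))      # interpret the date as local time


if __name__ == "__main__":
    dateval = "12:22:03 21 01 07"
    intval = date2int(dateval, "%H:%M:%S %d %m %y")
    print("dateval=" + dateval + "  intval=" + str(intval))
```

Integer values obtained this way can be compared or subtracted, which is the typical reason for converting dates to integers.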

11.5. replace

The replace function replaces the first matching substring in the given string parameter:
$e(target) = replace($e(source), $e(repsource), $e(reptarget));

11.6. replaceall

The replaceall function replaces all matching substrings in the given string parameter:
$e(target) = replaceall($e(source), $e(repsource), $e(reptarget));
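The difference between replace and replaceall can be illustrated with a short Python model. It assumes that the search parameter is treated as a literal substring; whether tracx interprets it as a regular expression is not stated in this guide.

```python
def replace(source, repsource, reptarget):
    """Replace only the first occurrence of repsource."""
    return source.replace(repsource, reptarget, 1)


def replaceall(source, repsource, reptarget):
    """Replace every occurrence of repsource."""
    return source.replace(repsource, reptarget)


if __name__ == "__main__":
    print(replace("a-b-c", "-", "+"))
    print(replaceall("a-b-c", "-", "+"))
```

In this model, the first call changes only the first dash, while the second call changes both.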

11.7. now

The now function returns the current date in string format to the caller:
$tok(val) = now;
If a special date format is required, a date and time pattern can be given:
$tok(val) = now [ 'yyyy-MM-dd HH:mm:ss.S'] ;
The date and time pattern follows the conventions of the Java class SimpleDateFormat.
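To make the sample pattern more tangible, the following Python sketch produces the same layout as 'yyyy-MM-dd HH:mm:ss.S'. In SimpleDateFormat, a single S stands for the milliseconds without zero padding. The function name and the optional moment parameter are additions of this sketch.

```python
from datetime import datetime


def now(moment=None):
    """Format a point in time like the pattern 'yyyy-MM-dd HH:mm:ss.S'."""
    if moment is None:
        moment = datetime.now()
    millis = moment.microsecond // 1000          # 'S': milliseconds, unpadded
    return moment.strftime("%Y-%m-%d %H:%M:%S") + "." + str(millis)


if __name__ == "__main__":
    print(now())
```

Other pattern letters are described in the SimpleDateFormat documentation and can be combined freely in the pattern string.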

12. Beyond the basics

Once you are familiar with the basic features of the tracx language, the following hints may help you with more advanced usage.

12.1. Number of subelements

To get the number of subelements of a given name, prefix the element with a double '$' sign and append the corresponding subelement name. Sample:
$root(subcount) = $$root.sub;
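As a point of comparison, the same count can be expressed in Python with xml.etree.ElementTree. The sketch assumes that only direct subelements with the given name are counted, which matches the way for-loops step through one level of the tree. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def count_subelements(element, name):
    """Count the direct subelements with the given tag name."""
    return len(element.findall(name))


if __name__ == "__main__":
    root = ET.fromstring(
        '<root><sub name="alpha"><sub name="nested"/></sub>'
        '<sub name="beta"/><other/><sub name="delta"/></root>'
    )
    root.set("subcount", str(count_subelements(root, "sub")))
    print(root.get("subcount"))
```

The nested sub element and the element named other are not counted in this model.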

12.2. Including files

With the #inc preprocessor instruction additional files can be included into a tracx program file. This is useful to maintain larger tracx programs.
#inc myprog.tracx

program IncludeSample
-- Here comes the main part
-- ...
end
A common use of the #inc preprocessor instruction is to maintain the tracx code of procedures in separate files. The #inc preprocessor instruction can be nested ( included files can contain further #inc instructions ).

12.3. Printing output to files

To write data to an output file, file descriptors can be used. An output file must be opened with an open statement; once writing to the file has finished, the file descriptor is closed with a close statement. The following sample illustrates the usage:
program OutFile

open myfile AS 'myfile';

print myfile 'This is output to myfile\n';

close myfile;

end

12.4. Deleting elements

To delete elements from an element structure, the delete statement can be used. The delete statement removes the referenced element from its parent's list. The delete statement can also be used inside a for-loop to delete elements that match a condition. Since the loop holds a copy of the sublist and each element is accessed only at the beginning of its iteration, the deletion works.
program DeleteElement

document tdoc;
element xmltest;
element sub;

load #tdoc from 'doc.xml';

$xmltest = #tdoc[root];

for $sub in $xmltest.sub
begin
   if $sub(name) = 'beta'
   then
      print stdout 'Deleting ...\n';
      delete $sub;
   end;
end;

save #tdoc to 'doc.xml';

end
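The reason why deleting inside the loop is safe can be illustrated in Python: the loop iterates over a copy of the child list, so removing an element from the parent does not disturb the iteration. This is a sketch of the idea with xml.etree.ElementTree, not the tracx implementation; the function name delete_named is hypothetical.

```python
import xml.etree.ElementTree as ET


def delete_named(root, tag, name):
    """Remove all direct subelements <tag> whose name attribute equals name."""
    for sub in list(root.findall(tag)):   # iterate over a copy of the sublist
        if sub.get("name") == name:
            root.remove(sub)              # modifies the parent, not the copy
    return root


if __name__ == "__main__":
    root = ET.fromstring(
        '<root><sub name="alpha"/><sub name="beta"/><sub name="delta"/></root>'
    )
    delete_named(root, "sub", "beta")
    print([sub.get("name") for sub in root.findall("sub")])
```

Iterating over the live child list while removing from it could skip elements, which is why the copy matters.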

12.5. Adding elements

Sometimes several elements have to be added to a parent element to consolidate information. This can be done with the add assignment operation. This assignment adds a reference to the element on the operator's right side to the element on the operator's left side.
program AddElement

document doc1;
document doc2;
document doc3;

element d1;
element d2;
element d3;

load #doc1 from 'doc1.xml';
load #doc2 from 'doc2.xml';

$d1 = #doc1[root];
$d2 = #doc2[root];

$d3 add $d1;
$d3 add $d2;

#doc3 = $d3;
save #doc3 to 'doc3.xml';

end
In the sample above, two documents doc1.xml and doc2.xml are concatenated to the new document doc3.xml.

12.6. String escape characters

String values are enclosed in single quote signs ( ' ). Sometimes a string must contain special characters, which have to be escaped. To escape a character, a leading backslash is used as a prefix character. The following sample shows the escapes for tab, single quote, backslash and newline:

print stdout 'This is a sample for printing tab\t quote \' , backslash \\ and finally newline \n';

13. Restrictions

Since tracx uses dot tokens ( '.' ) to separate subelements, dots are not allowed in element identifiers. Instead, element identifiers must be alphanumeric. This is a restriction that is not part of the XML standard.
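Before loading a foreign XML document, it may be useful to check whether its element names fit this restriction. The following Python sketch lists all tag names that are not purely alphanumeric. It is an illustrative helper outside of tracx; the function name is hypothetical, and "alphanumeric" is interpreted strictly as the ASCII letters and digits, which is an assumption.

```python
import re
import xml.etree.ElementTree as ET

# Assumed reading of 'alphanumeric': ASCII letters and digits only.
ALNUM = re.compile(r"[A-Za-z0-9]+")


def unsupported_tags(xml_text):
    """Return the sorted tag names that are not purely alphanumeric."""
    root = ET.fromstring(xml_text)
    return sorted({el.tag for el in root.iter()
                   if not ALNUM.fullmatch(el.tag)})


if __name__ == "__main__":
    sample = "<root><net.config/><host1/><time-stamp/></root>"
    print(unsupported_tags(sample))
```

An empty result means that, under this reading, all element names of the document can be addressed from tracx.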

14. Conclusion

tracx provides access to XML-based data structures in the form of a scripting language, which allows tracx programs to be used independently of the platform. Unlike common shell scripts, tracx focuses on structured XML and in-memory data access rather than on forking Unix processes or piping output to other processes. The tracx software is free for download under the GNU public license from www.lemke-it.com. If you have built and tested the tracx software and find it useful ( or not ), any comments and feedback are appreciated. Thanks in advance and have fun!