Creation of a Backend, From Soup to Nuts
----------------------------------------
By M. Dodge Mumford, May, 1998. Rev 0.5


Executive summary
------------------

This document is for the benefit of those who are also reading Stephen Northcutt's _Establishing a Network Monitoring and Analysis Capability_ (http://www.nswc.navy.mil/ISSEC/CID/cider.htm). It goes through the steps necessary to create a backend in Network Flight Recorder, a network traffic analysis tool. The backend recognizes when an attacker is attempting to use the reader's network as an intermediary for a smurf attack (see Introduction for details).


Introduction
------------

As is true with most of life, the really hard part of finding a solution to a problem is breaking down the problem into small, manageable chunks.  The problem of acting as an accomplice for a smurf attack (http://www.quadrunner.com/~chuegen/smurf.txt) can be broken down by asking the question "What does someone performing a smurf attack need from my network?" The answer is "To be able to use any broadcast address." The best way to tell if you are being used as a relay point for a smurf attack is to monitor your broadcast addresses for use by any IP address that is not on that local network.

Of course, it is rarely that simple. You may, for some reason, want to permit
a specific host to ping an entire subnet (perhaps as a quick way
to find out if all the hosts are alive). There needs to be a way to set up
such exceptions.

The traditional smurf attack uses ICMP echo requests. However, there is no reason why a different IP protocol can't be used, so our backend will look at all IP traffic. 
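
The detection rule described above can be sketched outside N-Code. The following Python is only an illustration of the idea, not part of NFR: the local network 10.0.0.0/24 and the function name is_smurf_candidate are assumptions made up for this example.

```python
import ipaddress

# Hypothetical local network, chosen only for illustration.
LOCAL_NET = ipaddress.ip_network("10.0.0.0/24")


def is_smurf_candidate(src, dst, local_net=LOCAL_NET):
    """Return True when a host outside local_net addresses its broadcast address."""
    src_ip = ipaddress.ip_address(src)
    dst_ip = ipaddress.ip_address(dst)
    to_broadcast = dst_ip == local_net.broadcast_address
    from_outside = src_ip not in local_net
    return to_broadcast and from_outside


# An outside host hitting the broadcast address is suspicious ...
print(is_smurf_candidate("208.239.113.163", "10.0.0.255"))
# ... while a local host doing the same is not flagged by this rule.
print(is_smurf_candidate("10.0.0.7", "10.0.0.255"))
```

The rule deliberately ignores the IP protocol, for the reason given above: anything sent to a broadcast address from outside is of interest.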

Please note: You cannot start developing N-Code unless you have a development environment set up first. Please see ADD_URL_HERE for creating a development environment.


What is a backend and how do we create one?
------------------------------------------

Backends exist within packages. A package is a set of backends grouped together for convenience. Backends within a package can share code, or they can operate independently of each other. In this intrusion detection package, each backend works independently of the others.

For a package to exist, all that is required is a directory name and a .cfg file. At a minimum, the .cfg file needs to contain the text

<pre>
	enabled=true
	title=Title
</pre>

If enabled is set to "false", then none of that package's backends will be loaded into the engine. Optionally, a package can also have a .nfr file, containing package-wide global variables and N-Code. A package can also optionally have a .desc file, which provides a text description of the package and its functions.  

For a backend to exist, it must be located within the package's directory. It must have a .cfg and a .nfr file. A .desc file can also be used.  The .cfg file should minimally look like this:

<pre>
	enabled=true
	title=Title
</pre>

(Hint-- if you don't want the backend to show up on the GUI, comment out the title line).

A very minimalist (and mildly obtuse) .nfr file is:

<pre>
	echo ( "Hello, World!\n" ) ;
</pre>

The next time you stop and start the NFR engine, "Hello, World!" will be appended to nfrd.log.


Let's create a backend.
-----------------------

We know that we're only going to want to look at specific IP addresses, so the first thing we should do is define those IP addresses. Then we can print those out to verify they get entered correctly. Create this packages/id/new_badhosts.nfr file:

<pre>
	badhosts = [ 10.0.0.255, 10.0.1.255, 10.0.2.255, 10.0.255.255 ] ;
	echo ( "This is badhosts: ", badhosts, "\n" ) ;
</pre>

Rather than restarting the engine, you can use bin/test-nfrd to verify the syntax and execute any non-packet related functions. If you type this, you should get: 

<pre>
	$ bin/test-nfrd packages/id/new_badhosts.nfr
	This is badhosts: [10.0.0.255,10.0.1.255,10.0.2.255,10.0.255.255]
	$
</pre>

Hint: notice that the variable badhosts is of the type "list". Within a single list, it is possible to have multiple types of values--integers, IP addresses, IP networks, strings, and even other lists. An example of a list containing all of the above would be:

<pre>
	fred = [ 16842752, 10.0.0.1, 10.0.0.0:255.0.0.0, "Private Network", 
		[ 10.0.0.0, 10.255.255.255 ] ] ;
</pre>


How do we know when a packet arrives?
-------------------------------------

For this next step, we want our N-Code to execute whenever the NFR engine sees an IP packet. First, we will need to create a .cfg file for this backend. Since this backend bears an uncanny resemblance to the badhosts backend, we'll call this one new_badhosts. Put the following into packages/id/new_badhosts.cfg:

<pre>
	enabled=true
	title=New Bad Hosts
</pre>

Next, replace packages/id/new_badhosts.nfr with the following:

<pre>
	badhosts = [ 10.0.0.255, 10.0.1.255, 10.0.2.255, 10.0.255.255 ] ;

	filter new_badhosts ip ( ) {
		echo ( "Here 00: ", ip.src, "->", ip.dst, "\n" ) ;
	}
</pre>

To see this take effect, you will need to stop and restart the NFR engine (as root, execute "sh etc/stop_nfr ; sh etc/start_nfr"). If you 'tail nfrd.log', you should see many entries like:

<pre>
	Here 00: 208.239.113.130->152.1.58.124
	Here 00: 152.1.58.124->208.239.113.130
	Here 00: 208.239.113.163->208.239.113.161
</pre>

The filter statement says to look at all IP packets, and execute the code between the curly braces. The parentheses are there because the filter statement is capable of doing basic, preliminary analysis. However, our requirements exceed that functionality in this case.

In the echo statement, you see a couple of variables we haven't defined. They are supplied during run time by the engine. Their meaning should be fairly obvious. 


But we only want to know about specific hosts
---------------------------------------------

The next thing we want to do is ignore the packet if it is not one of the ones we're interested in. Make the appropriate changes in packages/id/new_badhosts.nfr so that it reads:

<pre>
	badhosts = [ 10.0.0.255, 10.0.1.255, 10.0.2.255, 10.0.255.255 ] ;

	filter new_badhosts ip ( ) {
		if ( ! ( ip.src inside badhosts || ip.dst inside badhosts ) )
			return ;
		echo ( "Here 00: ", ip.src, "->", ip.dst, "\n" ) ;
	}
</pre>

Then restart the engine. Try pinging 10.0.0.255, and in nfrd.log you should see:

<pre>
	Here 00: 208.239.113.163->10.0.0.255
	Here 00: 208.239.113.163->10.0.0.255
	Here 00: 208.239.113.163->10.0.0.255
</pre>

The neat thing here is the way we tell if an IP address is in a previously
defined list or network--using the keyword "inside".

The important thing here is to note the use of parentheses in the if statement.
N-Code does not prioritize logical operators, so if you were to have said

<pre>
	if ( ! ip.src inside badhosts || ip.dst inside badhosts ) 
</pre>

the end result would almost assuredly not be what you wanted. 
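
To see why the grouping matters, the two readings can be compared side by side. This is a small Python illustration, not N-Code; the text does not say exactly how N-Code evaluates the ungrouped form, so the second function shows only one plausible reading, in which the negation applies to the first test alone. Both function names are made up for this example.

```python
def grouped(src_bad, dst_bad):
    # Intended meaning: skip the packet only when neither address is listed.
    return not (src_bad or dst_bad)


def ungrouped(src_bad, dst_bad):
    # One possible reading without the inner parentheses (an assumption):
    # the negation binds to the first test only.
    return (not src_bad) or dst_bad


# Print the full truth table: src listed, dst listed, grouped, ungrouped.
for src_bad in (False, True):
    for dst_bad in (False, True):
        print(src_bad, dst_bad, grouped(src_bad, dst_bad), ungrouped(src_bad, dst_bad))
```

Under that reading the two forms disagree whenever the destination is listed, which is exactly the case this backend cares about.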


Now it's triggering every time my internal hosts hit the broadcast addresses.
-----------------------------------------------------------------------------

We need a bunch of exceptions, or hosts that are allowed to communicate with the broadcast addresses. Modify packages/id/new_badhosts.nfr to look like this:

<pre>
	badhosts = [ 10.0.0.255, 10.0.1.255, 10.0.2.255, 10.0.255.255 ] ;

	exceptions [ 10.0.0.255 ] = [ 10.0.0.0:255.255.255.0 ] ;
	exceptions [ 10.0.1.255 ] = [ 10.0.0.128:255.255.255.128 ] ;
	exceptions [ 10.0.2.255 ] = [ 10.0.2.2, 10.0.2.3 ] ;
	exceptions [ 10.0.255.255 ] = [ 10.0.0.0:255.255.255.0,
		10.0.0.128:255.255.255.128,
		10.0.2.2, 10.0.2.3 ] ;

	filter new_badhosts ip ( ) {
		if ( ! ( ip.src inside badhosts || ip.dst inside badhosts ) )
			return ;
		if ( ip.dst inside exceptions[ip.src] ||
			ip.src inside exceptions[ip.dst] )
			return ;

		echo ( "Here 00: ", ip.src, "->", ip.dst, "\n" ) ;
	}
</pre>

Notice the introduction of arrays. Arrays can be of the types integer, string, IP address, and IP network. Unfortunately, arrays are currently uni-dimensional. 

What is happening in this instance is that the current packet's source and destination IP address pair are compared against the table to see if they are "allowed" to talk with each other. If so, the filter returns without reporting the packet.

Stop and restart the engine to see it work.
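
The exception lookup can also be sketched outside N-Code. The Python below is an illustration only: it copies part of the table above into a dictionary, writes the netmasks in prefix notation (255.255.255.0 becomes /24), and uses a made-up function name, is_allowed.

```python
import ipaddress

# Part of the exceptions table above: broadcast address -> permitted peers.
EXCEPTIONS = {
    "10.0.0.255": ["10.0.0.0/24"],
    "10.0.2.255": ["10.0.2.2/32", "10.0.2.3/32"],
}


def is_allowed(src, dst, table=EXCEPTIONS):
    """True when one end is a listed broadcast address and the other end may use it."""
    # Check both directions, as the two halves of the N-Code test do.
    for bcast, peer in ((dst, src), (src, dst)):
        peer_ip = ipaddress.ip_address(peer)
        for net in table.get(bcast, []):
            if peer_ip in ipaddress.ip_network(net):
                return True
    return False


print(is_allowed("10.0.0.17", "10.0.0.255"))        # local host, permitted
print(is_allowed("208.239.113.163", "10.0.0.255"))  # outside host, not permitted
```

Looking up an address that has no entry simply yields an empty list here, so packets between unlisted hosts are never treated as exceptions.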


Great, so they're communicating. What kind of stuff are they saying?
--------------------------------------------------------------------

Let's say we want to know what IP protocol is being communicated. If it is a protocol that uses ports, we want to record that information, too. And, just for fun, if it's a TCP packet, let's get the TCP flags. A quick glance at /etc/protocols tells us what protocols are valid for IP networks.  Locating and muddling through RFC 793 tells us where within the TCP headers the TCP flags are, and what order they should appear in.

After the exceptions definitions, but before the filter statement, add:

<pre>
	proto_translate[0] = "IP" ;
	proto_translate[1] = "ICMP" ;
	proto_translate[2] = "IGMP" ;
	proto_translate[3] = "GGP" ;
	proto_translate[6] = "TCP" ;
	proto_translate[12] = "PUP" ;
	proto_translate[17] = "UDP" ;
	proto_translate[22] = "IDP" ;
	proto_translate[255] = "RAW" ;
</pre>

and replace the echo statement with (yes, really, all of it):

<pre>
		if ( ip.protocol == 6 || ip.protocol == 17 ) {
			$sport = short ( ip.blob, 0 ) ;
			$dport = short ( ip.blob, 2 ) ;
		} else {
			$sport = 0 ; 
			$dport = 0 ;
		}
		$flagString = "" ;
		if ( ip.protocol == 6 ) {
			$TCPflags = byte ( ip.blob, 13 ) ;
			if ( $TCPflags & 1 ) 
				$flagString = cat ( $flagString, "fin " ) ;
			if ( $TCPflags & 2 ) 
				$flagString = cat ( $flagString, "syn " ) ;
			if ( $TCPflags & 4 ) 
				$flagString = cat ( $flagString, "rst " ) ;
			if ( $TCPflags & 8 ) 
				$flagString = cat ( $flagString, "psh " ) ;
			if ( $TCPflags & 16 ) 
				$flagString = cat ( $flagString, "ack " ) ;
			if ( $TCPflags & 32 ) 
				$flagString = cat ( $flagString, "urg " ) ;
		}

		if ( ! ($protoString = proto_translate[ip.protocol]) ) {
			$protoString = "Unknown" ;
		}
		echo ( "Here 00: ", ip.src, "(", $sport, ")->", ip.dst, 
			"(", $dport, "): ", $protoString, " ", $flagString, "\n" ) ;
</pre>

Restart the engine, ping the broadcast address, try telnetting to it, and in nfrd.log you should see something like:

<pre>
	Here 00: 208.239.113.163(0)->10.0.0.255(0): ICMP 
	Here 00: 208.239.113.163(0)->10.0.0.255(0): ICMP 
	Here 00: 208.239.113.163(7865)->10.0.2.255(23): TCP syn 
	Here 00: 208.239.113.163(7865)->10.0.2.255(23): TCP syn 
</pre>

Interesting things we've added here that are noteworthy:

ip.protocol refers to the protocol byte of the IP header, as described in RFC 791.

ip.blob contains the entire contents of the IP packet payload. According to RFCs 793 and 768, if the IP packet is of the type TCP or UDP, the first 16 bits (or 2 bytes, or the first short) contain the source port number. The second 16 bits (or 2 bytes, etc.) contain the destination port number. The term "blob" is synonymous with the word "string" within N-Code.

You may have also noticed the "short" and "byte" functions. They return the short value (16 bits) or byte value (8 bits) of a blob, starting at a specific position.
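
For readers who want to check the offsets, here is a rough Python equivalent of the short() and byte() reads used in the filter. It is a sketch, not NFR code: it assumes the values are read in network byte order (big-endian), as the RFCs specify, and the function names and the hand-built header are made up for the example.

```python
import struct


def parse_ports(payload):
    """Return (source_port, destination_port) from a TCP or UDP payload."""
    # "!HH" reads two big-endian unsigned 16-bit values at offsets 0 and 2.
    return struct.unpack_from("!HH", payload, 0)


def tcp_flag_byte(payload):
    """Return the byte at offset 13 of a TCP header, which holds the flags."""
    return payload[13]


# Hand-built 20-byte TCP header: ports 7865 -> 23, only the SYN bit set.
header = struct.pack("!HHIIBBHHH", 7865, 23, 0, 0, 5 << 4, 0x02, 8192, 0, 0)

print(parse_ports(header))    # the two port numbers
print(tcp_flag_byte(header))  # the raw flag byte
```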

With the ports, we also introduce a new kind of variable--one that begins with a dollar sign ($). All other variables we have dealt with thus far have either been defined automatically by the system (e.g. ip.src), or have been defined globally, outside of a function or filter statement (e.g. exceptions[]). Variables that start with a $ are local to that function, and cannot be accessed from outside that function.

We have also introduced bitwise operators. '&' is a bitwise 'and'. For example, ( 1 & 1 ) returns true, ( 2 & 1 ) returns false, ( 3 & 1 ) returns true, and so forth. '|' is 'or', and '^' is 'xor'.
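
The chain of flag tests in the filter can be written compactly as a table of bit values. The Python below is an illustration only; the bit values are the ones used in the N-Code above, and the names FLAG_BITS and flag_string are made up for the example.

```python
# Bit values as used in the N-Code above (low six bits of the TCP flag byte).
FLAG_BITS = [(1, "fin"), (2, "syn"), (4, "rst"), (8, "psh"), (16, "ack"), (32, "urg")]


def flag_string(flags):
    """Build the flag string the way the filter does, one name per set bit."""
    return "".join(name + " " for bit, name in FLAG_BITS if flags & bit)


print(repr(flag_string(0x02)))  # SYN only
print(repr(flag_string(0x12)))  # SYN and ACK
```

Each name is followed by a space, which matches the trailing space visible after "syn" in the log output above.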


Cool! How do I get it to record this stuff to disk?
---------------------------------------------------

Now things start getting a little more complicated, but not horribly so.  The first thing we're going to do is decide exactly what to record, field by field. They will be recorded thusly:
	- source IP address
	- source IP port
	- destination IP address
	- destination IP port
	- IP protocol
	- TCP flags

First, change packages/id/new_badhosts.cfg (yes, .cfg, not .nfr). The .cfg file describes to the GUI the format in which data has been recorded. It should end up looking like this:

<pre>
	enabled=true
	title=New Bad Hosts
	
	gui=list
	# implicit zero-eth column is time
	num_columns=6
	column_1_type=p_src_ip
	column_2_type=p_src_port
	column_3_type=p_dst_ip
	column_4_type=p_dst_port
	column_5_type=p_string
	column_6_type=p_string
	
	column_1_label=Source Addr
	column_2_label=Source Port
	column_3_label=Dest Addr
	column_4_label=Dest Port
	column_5_label=Protocol
	column_6_label=TCP Flags

	rollover_size=YES
	rollover_size_val=1024000
	rollover_time=YES
	rollover_time_val=300000
	archive_path=data/%p/%b/%y/%m%d/
	cfversion=1
</pre>

In the default distribution, there are two different types of recorders: list and histogram. The list recorder is the simpler of the two; it is more or less like a sequential database. When you query it, you get a list of all the events matching your criteria. Histogram, on the other hand, is designed to count events. When queried for only a specific IP address, for example, it will show the IP address and the number of times that IP address has been recorded. We will write this backend to use the list recorder.
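
The difference between the two recorder types can be pictured with a few lines of Python. This is a conceptual sketch only and says nothing about how NFR stores data on disk; the sample events and the function names are made up for the example.

```python
from collections import Counter

# Hypothetical recorded events: (time, source address).
events = [
    (1, "208.239.113.163"),
    (2, "208.239.113.163"),
    (3, "152.1.58.124"),
]


def list_query(records, ip):
    """List-style answer: every matching event, in the order recorded."""
    return [rec for rec in records if rec[1] == ip]


def histogram_query(records, ip):
    """Histogram-style answer: the address and how many times it was seen."""
    counts = Counter(src for _, src in records)
    return ip, counts[ip]


print(list_query(events, "208.239.113.163"))
print(histogram_query(events, "208.239.113.163"))
```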

num_columns defines the number of fields that are being recorded. Note that each record automatically gets recorded with the system time, and that its presence is assumed.

The column_n_type variable describes the type of data held in that field. column_n_label gives a title to the field for the GUI.

If we tried to keep the data forever, the disk would fill up quickly.  Spaceman, the space management utility, will look at rollover_size and rollover_time to decide when to rotate and archive the data.

The engine will look at archive_path to find out where the data should be saved. Unless the string starts with a slash, the $NFRHOME directory is assumed to be the root directory. The various % macros should be used consistently, so we know where the data is.

The current revision of the format of the data should be kept in cfversion.

Now edit packages/id/new_badhosts.nfr (yes, .nfr). Add the following after the "proto_translate" but before the "filter" statements.

<pre>
	new_badhosts_schema = library_schema:new ( 1, [ "time", "ip", "int", 
		"ip", "int", "string", "string" ], scope() ) ;
	new_badhosts_recorder = recorder ( 
		"bin/list packages/id/new_badhosts.cfg", 
		"new_badhosts_schema" ) ;
</pre>

Also, after the echo statement, add:

<pre>
	record system.time, host(ip.src), $sport, host(ip.dst), $dport, 
		$protoString, $flagString to new_badhosts_recorder ;
</pre>

new_badhosts_schema is a variable that tells the recorder what format to record the data in. Note that the first argument is always 1--this is for future expansion. The next argument is the list of column types, and it begins with time--remember how the .cfg file stated that the implicit zeroth column is time? The last argument is the scope of the current function, easily referenced through the scope() function.

new_badhosts_recorder is a variable that contains the information about the recorder, like where it is, what configuration file to use, and the name of the variable that contains the schema.

The record statement actually records the information to disk. 

To verify, restart the engine, start a Java-enabled Web browser, start the GUI, select the backend, and press the "Query" button. When the next window appears, click on either the 'Display as Text' or 'Display as HTML' button. You should see data like:

<pre>
	Thu Apr 30 16:45:59 1998 208.239.113.163            0 10.0.0.255                0 ICMP                                                          
	Thu Apr 30 16:46:00 1998 208.239.113.163            0 10.0.0.255                0 ICMP                                                          
	Thu Apr 30 16:46:01 1998 208.239.113.163            0 10.0.0.255                0 ICMP                                                          
	Thu Apr 30 16:46:10 1998 208.239.113.163         8256 10.0.2.255               23 TCP                            syn                            
	Thu Apr 30 16:46:13 1998 208.239.113.163         8256 10.0.2.255               23 TCP                            syn                            
	Thu Apr 30 16:46:19 1998 208.239.113.163         8256 10.0.2.255               23 TCP                            syn                            
	Thu Apr 30 16:46:27 1998 208.239.113.163         8275 10.0.1.255               23 TCP                            syn                            
</pre>


Amazing! How do I trigger an alert on this?
-------------------------------------------
At the top of the file, add 

<pre>
	MyAlertContext = alertContext(5);
</pre>

After the record statement, add this:

<pre>
		$message = cat ( $protoString, "  ", $flagString ) ;
		alert ( _:BAD_HOST_BACKEND, _:NEW_BAD_HOST, MyAlertContext,
			ip.src, ip.dst, $message ) ;
</pre>

And now for added fun, create the file packages/id/new_badhosts.acf and put this into it:

<pre>
	acfhdr {
		version = 1;
		fixed = TRUE;
	}

	alert_source BAD_HOST_BACKEND {
		shortname = NEW_BAD_HOST_BACKEND;
		longname = "Bad Host Backend";
	}

	alert NEW_BAD_HOST {
		rules = networklist popup;
		severity = SEV_ATTACK;
		format = "$(1) attempted to contact $(2): $(3)";
	}
</pre>

The alert statement sends the IP addresses and the contents of $message to the alerting system. The alert source has the short name NEW_BAD_HOST_BACKEND, and the alert is NEW_BAD_HOST; both are defined in the .acf file. The rules line sends the alert to networklist as well as to the popup window.

The alertContext is used to prevent the backend from triggering too many alerts in a short time period. Setting MyAlertContext = alertContext(5), and calling the alert() function with MyAlertContext as the third argument (after the alert source and the alert), guarantees that the alert will go out no more than once every five seconds.
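
The effect of the alert context can be pictured as a simple throttle. The Python below is a sketch of that idea only; how NFR implements alertContext internally is not described here, and the class name AlertThrottle and its method are made up for the example. Timestamps are passed in explicitly so the behaviour is easy to follow.

```python
class AlertThrottle:
    """Let an alert through at most once per interval (in seconds)."""

    def __init__(self, interval_seconds):
        self.interval = interval_seconds
        self.last_sent = None  # time of the last alert that was let through

    def should_send(self, now):
        # Suppress the alert if the previous one went out too recently.
        if self.last_sent is not None and now - self.last_sent < self.interval:
            return False
        self.last_sent = now
        return True


throttle = AlertThrottle(5)
# Alerts raised at these times (seconds); only some get through.
sent = [t for t in (0, 1, 4, 5, 6, 11) if throttle.should_send(t)]
print(sent)
```

With a five-second interval, the alerts raised at 1, 4 and 6 seconds are suppressed and the others go out.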

The structure of the .acf file requires some description. First, whitespace doesn't matter. Comments start with a '#'. The acfhdr section needs to be there. 'version' must always be 1, and 'fixed' should always be "TRUE". Both entries need to be there. Should the format of this file change, the version number will increment. There should only be one acfhdr (alert configuration header) section in the file. You may have as many alert_source sections as you want. You may also have as many alert sections as you want. They can be in any order.

Now, restart NFR and try pinging one of the bad hosts. If you go into the alert section of the GUI, you should see your alert.


Wonderful! How do I set it up to e-mail me?
------------------------------------------
Log into the GUI as a user with administrator privileges. Go to Administration / Alert Configuration. Go into the group All. Find the alert NEW_BAD_HOST, and right-click it. Select Rules. Click the "New" button. Select "E-mail" and hit the "OK" button. Give the rule a name like "Email_Me", and put your email address into the Recipients field. Latency and Alerts/Message describe how long an alert will stay in the queue before it is emailed. Whichever limit is exceeded first (by default either fifty alerts or 300 seconds (five minutes)) will cause the email to be sent. For testing purposes, set latency to 15 seconds. In a production setting, it's probably best to keep it at 300 seconds but set the Alerts/Message to a very high number--you don't want to flood your mail server.

Click OK, and you should see Email_Me in the Available Rules list. Highlight it and click the "Add ->" button. Then click OK.

Try a ping and a few telnets. In a few minutes, you will receive an e-mail message regarding your activity.
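
The "whichever limit is reached first" behaviour of the Latency and Alerts/Message settings can be sketched as a small queue. This Python is an illustration of the batching idea only, not NFR's mail code; the class name EmailBatcher, its methods, and the choice to treat reaching a limit as triggering the send are all assumptions.

```python
class EmailBatcher:
    """Queue alerts and release them as one batch when either limit is reached."""

    def __init__(self, max_alerts=50, latency_seconds=300):
        self.max_alerts = max_alerts
        self.latency = latency_seconds
        self.queue = []
        self.first_queued_at = None  # arrival time of the oldest queued alert

    def add(self, alert, now):
        """Queue an alert; return the batch to mail if a limit is reached, else None."""
        if not self.queue:
            self.first_queued_at = now
        self.queue.append(alert)
        return self.flush_if_due(now)

    def flush_if_due(self, now):
        """Return and clear the queue when it is full or its oldest alert is too old."""
        if not self.queue:
            return None
        too_many = len(self.queue) >= self.max_alerts
        too_old = now - self.first_queued_at >= self.latency
        if too_many or too_old:
            batch, self.queue = self.queue, []
            self.first_queued_at = None
            return batch
        return None


# Testing-style settings: a short latency and a tiny batch size.
batcher = EmailBatcher(max_alerts=3, latency_seconds=15)
print(batcher.add("ping 1", now=0))
print(batcher.add("ping 2", now=1))
print(batcher.add("telnet 1", now=2))  # third alert fills the batch
```

A very high Alerts/Message value, as suggested for production, makes the latency the only limit that matters in practice.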


Conclusion
----------

This backend barely scratches the surface of what NFR can do. We have not looked at the payload of the packet at all. We have filtered only on IP packets--it is possible to look at the ethernet frame, as well as follow a TCP stream. But this should describe in suitable detail how to create a new backend that looks at those other network layers; you will just have to look up the specific variable names. You will also want to look at the various backends included with the distribution as examples.

Good Luck!


COPYRIGHT NOTICE
----------------
(C) 1998, 2000 Network Flight Recorder, Inc. All rights reserved. This document
may be distributed freely as long as this copyright notice remains intact.