Inserting and Creating Labels
To insert a label, we provide the dp_label_flows.rb tool. With this tool you can create new anomaly types, such as a new attack that is not in the ANOMALIES table, as well as associate flows and a description with the label.
The first step before running dp_label_flows.rb is to extract the attack flows from the FLOWS table and place them into a file, one flow per line, in the format:
<interval_as_timestamp>,<flow_id>
The script uses this file to associate the attack flows with the label you create. Users can then see not only where attacks are labeled in the data set, but also which flows were associated with each attack. Again, this is useful for synthetic attack analysis and generation with real attack flows. We will use the following attack flow file as our example; it describes an attack that started at the end of the 2005-02-01 00:00:00 interval and carried over into the next:
$ cat attack_flows | head -n 5
2005-02-01 00:00:00,1000
2005-02-01 00:00:00,1001
2005-02-01 00:05:00,0
2005-02-01 00:05:00,1
2005-02-01 00:05:00,2
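Because the script relies on every line following the <interval_as_timestamp>,<flow_id> format, it can help to check the file before labeling. The following is a minimal sketch, not part of dp_label_flows.rb: the helper name parse_attack_flows is hypothetical, and it assumes the timestamp always has the YYYY-MM-DD HH:MM:SS shape shown in the example and that flow IDs are non-negative integers. It checks only the shape of each line, not whether the interval or flow exists in the database.

```ruby
# Hypothetical helper (not part of dp_label_flows.rb): parse lines of the
# form "<interval_as_timestamp>,<flow_id>" and group flow IDs by interval.
# Assumes timestamps look like "2005-02-01 00:00:00" and flow IDs are
# non-negative integers, as in the example file.
INTERVAL_PATTERN = /\A\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\z/
FLOW_ID_PATTERN  = /\A\d+\z/

def parse_attack_flows(lines)
  flows = Hash.new { |hash, key| hash[key] = [] }
  lines.each_with_index do |raw, index|
    line = raw.strip
    next if line.empty? # tolerate blank lines

    interval, flow_id = line.split(",", 2)
    unless interval.match?(INTERVAL_PATTERN) && flow_id.to_s.match?(FLOW_ID_PATTERN)
      raise ArgumentError,
            "line #{index + 1}: expected <interval>,<flow_id>, got #{line.inspect}"
    end

    flows[interval] << Integer(flow_id, 10)
  end
  flows
end

# Usage with the example lines from this section.
sample = [
  "2005-02-01 00:00:00,1000",
  "2005-02-01 00:00:00,1001",
  "2005-02-01 00:05:00,0",
  "2005-02-01 00:05:00,1",
  "2005-02-01 00:05:00,2"
]
parse_attack_flows(sample).each do |interval, ids|
  puts "#{interval}: #{ids.length} flows"
end
# prints:
# 2005-02-01 00:00:00: 2 flows
# 2005-02-01 00:05:00: 3 flows
```

To check a real file, pass File.readlines("attack_flows") instead of the sample array. Grouping by interval makes it easy to confirm that the flows fall in the intervals you expect before running the labeling tool.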
Next, run the dp_label_flows.rb tool. It lists the attack types that already exist in the database and asks you to select the one that describes the attack you are labeling. Select the attack type that most closely fits your attack. If none fits, you can add a new one by entering -1 as the attack type. After selecting the attack type, specify the attack flows filename, which is attack_flows in our example:
$ ./dp_label_flows.rb
ID | Description
------|-----------------
1 | inbound bandwidth flood
2 | outbound bandwidth flood
3 | inbound worm activity
4 | outbound worm activity
5 | inbound horizontal scan
6 | outbound horizontal scan
7 | inbound vertical scan
8 | outbound vertical scan
Enter an ID which describes the type of attack you are labeling (-1 for new label): 2
Enter a filename which includes a list of flows associated with the attack (FILE FORMAT: <interval>,<flow_id>): attack_flows
After specifying the attack flows filename, your favorite $EDITOR will open and you can enter the description of the attack. For example, we enter:
- increased traffic on port 80 (magnitude larger than normal)
- destined to the single host: 402737628 (internet)
- 22 flow sizes > 755191
- two Intranet hosts generate all of this additional traffic: 179858, 147761
- does not show up in degree since these two hosts only contact 4 unique hosts
After you save the label you should see something like:
done! Your label id is: 1
Note that we discourage creating specific attack types such as "blaster worm". There is currently no support for organizing attack types into a hierarchy, so a user could not select "labeled worms" and get the blaster worm. We could add this functionality in the future, but there is currently no need for it. Instead, add "blaster worm" to your attack description. Searching through attack descriptions is supported, and users can use it to find more specific attack types.
