nature’s signatures

One more in the list of technical posts! Yesterday was a 17-hour day in the lab (phew :-)
We were capturing packets, but the packit tool did not randomize the source IPs enough, so we were getting decent signatures for TCP traffic but not for ICMP! Looking at the signature generation, I found that the checksum was also being fed into the hash. When I stopped using the checksum values to generate the hash, the signatures started coming out properly. Antoine, somehow, thought that the IP addresses were affecting the hash values that we got, but looking deeper into the code we saw that this was not the case. The conclusion (which is really surprising) is that packit was generating quite a few similar packets, and from the same source IP at that (they really should have been randomized!)… I don’t know whether this conclusion is correct. Maybe some packit developers will be able to help me with this!
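As a rough userspace sketch of leaving the checksum out of the hash: the function below hashes an ICMP message while treating the two checksum bytes (offsets 2 and 3 of the ICMP header) as zero. It assumes the checksum in question is the ICMP one, and it uses FNV-1a purely as a stand-in hash; the function names are hypothetical and this is not the module's actual code.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in hash step (32-bit FNV-1a); NOT the hash used by the module. */
static uint32_t fnv1a_update(uint32_t h, uint8_t byte)
{
    return (h ^ byte) * 16777619u;
}

/*
 * Hash an ICMP message (header + payload) while treating the 16-bit
 * checksum at byte offsets 2 and 3 as zero, so that two messages that
 * differ only in their checksum produce the same value.
 */
uint32_t icmp_hash_skip_checksum(const uint8_t *icmp, size_t len)
{
    uint32_t h = 2166136261u;   /* FNV offset basis */
    size_t i;

    for (i = 0; i < len; i++) {
        uint8_t b = (i == 2 || i == 3) ? 0 : icmp[i];
        h = fnv1a_update(h, b);
    }
    return h;
}
```

Called on two echo requests that differ only in the checksum field, this returns the same value for both; any other differing byte changes the result.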
So now the challenge is to send those ICMP signatures across… but the icmp_send() method requires an skbuff structure… I looked at the net/ipv4/ipip.c file for its usage of icmp_send(), but it is still not clear to me how it should be used!

-Rajat
Rajat’s Homepage

Projects…interjects…Part Deux!

Yesterday was a fantastic day spent trying to get a pointer to the IP data inside the skbuff structure. The documentation in the source files did not help make things clearer.
I needed to use the unsigned char* data field in the skbuff structure to reach the starting point of the IP data.
I tried a lot of pointer math and the following finally worked:
IP Data pointer location:

unsigned char * ptr = sb->data + sb->nh.iph->ihl*4;
int byte_size = ntohs(sb->nh.iph->tot_len) - sb->nh.iph->ihl*4;

In fact, Vinay Reddy (vinayvinay@gmail.com) suggested something that I think is even better than what was working for me. He said the pointer value should be:

unsigned char * ptr = sb->nh.raw + sb->nh.iph->ihl*4;

I think this captures exactly what I want to do.
I *really* want the pointer to be relative to the IP header. I do not care where sb->data points, so I guess Vinay’s method is much better. I haven’t implemented it yet, so I don’t know for sure, but it sounds the most logical!
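The same pointer math can be tried outside the kernel. This is a minimal userspace sketch that assumes the packet sits in a flat byte buffer starting at the IPv4 header; ip_payload is a hypothetical helper, not a kernel function. It reads ihl and tot_len straight from the header bytes and adds bounds checks that the two-line version leaves out.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Locate the payload of an IPv4 packet held in a flat buffer.
 * Returns a pointer to the first byte after the IP header and stores the
 * payload length in *payload_len, or returns NULL if the header fields
 * are inconsistent with the buffer.
 */
const uint8_t *ip_payload(const uint8_t *pkt, size_t pkt_len, size_t *payload_len)
{
    size_t hdr_len, tot_len;

    if (pkt_len < 20)                           /* shorter than a minimal header */
        return NULL;
    hdr_len = (size_t)(pkt[0] & 0x0f) * 4;      /* ihl counts 32-bit words */
    tot_len = ((size_t)pkt[2] << 8) | pkt[3];   /* tot_len is big-endian */
    if (hdr_len < 20 || tot_len < hdr_len || tot_len > pkt_len)
        return NULL;
    *payload_len = tot_len - hdr_len;
    return pkt + hdr_len;
}
```

For a 28-byte packet with a 20-byte header, this yields a pointer 20 bytes into the buffer and a payload length of 8.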

- Rajat
http://www-scf.usc.edu/~swarup/

Brand New Day

It’s a brand new day with no novelty! Back in the lab today, now trying to get access to the packet data to calculate the hash values. I suspect that the unsigned char* data field inside the sk_buff structure that netfilter hands over is exactly what I need to get the hash values. There’s this awesome link which has great information about the sk_buff structure. The unsigned int len; field holds the size of the complete input data including the headers. I guess that if this len value == the size of the data that follows the IP header (starting with the TCP / UDP / ICMP header), and we hash this data in chunks, then the following algorithm could be used:

no_of_chunks = len / BYTE_SIZE_FOR_SIGN;
addendum = len % BYTE_SIZE_FOR_SIGN;

/* hash every full chunk; the end offset is taken to be inclusive */
for (int i = 0; i < no_of_chunks; i++) {
    storeInTable(hashRabin(data, i*BYTE_SIZE_FOR_SIGN, (i+1)*BYTE_SIZE_FOR_SIGN - 1, 0));
}
/* hash the leftover bytes, if there are any */
if (addendum > 0) {
    storeInTable(hashRabin(data, no_of_chunks*BYTE_SIZE_FOR_SIGN, no_of_chunks*BYTE_SIZE_FOR_SIGN + addendum - 1, 0));
}


These are my initial thoughts; let’s see how it works out!
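The chunking idea can be tried in userspace before it goes into the module. This is only a sketch: BYTE_SIZE_FOR_SIGN is set to 8 as an assumed value, hash_range is a simple polynomial stand-in rather than a real Rabin hash, and the results go into a plain array instead of storeInTable().

```c
#include <stddef.h>
#include <stdint.h>

#define BYTE_SIZE_FOR_SIGN 8   /* assumed chunk size; pick whatever the module uses */

/* Stand-in polynomial hash over data[start..end] (inclusive); not a real Rabin hash. */
static uint32_t hash_range(const uint8_t *data, size_t start, size_t end)
{
    uint32_t h = 0;
    size_t i;

    for (i = start; i <= end; i++)
        h = h * 31u + data[i];
    return h;
}

/*
 * Hash `len` bytes in fixed-size chunks plus one shorter tail chunk.
 * Writes at most `max_out` hashes into `out` and returns how many were written.
 */
size_t hash_chunks(const uint8_t *data, size_t len, uint32_t *out, size_t max_out)
{
    size_t full = len / BYTE_SIZE_FOR_SIGN;
    size_t tail = len % BYTE_SIZE_FOR_SIGN;
    size_t n = 0, i;

    for (i = 0; i < full && n < max_out; i++)
        out[n++] = hash_range(data, i * BYTE_SIZE_FOR_SIGN,
                              (i + 1) * BYTE_SIZE_FOR_SIGN - 1);
    if (tail != 0 && n < max_out)   /* skip the tail when len divides evenly */
        out[n++] = hash_range(data, full * BYTE_SIZE_FOR_SIGN, len - 1);
    return n;
}
```

With 20 bytes of input and 8-byte chunks, this produces three hashes: two full chunks and one 4-byte tail.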

-Rajat.
Rajat’s Homepage

Die Another Day!

Back again in the lab to get the module completed, as this part needs a lot of effort.
The RabinHash available at Jaspell was very helpful in getting me started with the actual coding of the whole thing. Since the Rabin hash values are spread over a wide range, I first need to figure out how to check efficiently whether a packet’s hash is already present. Today I’ll try an idea where I take each hash modulo 3 distinct prime numbers and see which values they map to. These indices select slots in tables of pointers, and the slots point to the respective hash values.

           mod p1                          mod p3
      |_______________|                |_______________|
      |_______________|--->[val1]<--+  |_______________|
      |_______________|--->[val2]   |__|_______________|
      |_______________|--->[val3]<--+
              :          :   :      |         :
              :          :   :      |___      :
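One possible reading of the diagram, sketched in userspace C: three pointer tables, each indexed by the hash modulo a different prime, with every slot pointing into a shared pool of stored hash values. The primes, the pool size, the names, and the policy of never overwriting an occupied slot are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed table sizes: three distinct primes chosen only for illustration. */
#define P1 101
#define P2 103
#define P3 107
#define MAX_VALUES 64

struct sig_index {
    uint32_t values[MAX_VALUES];   /* stored hashes ([val1], [val2], ...) */
    size_t count;
    const uint32_t *t1[P1];        /* slot index = hash mod P1 */
    const uint32_t *t2[P2];        /* slot index = hash mod P2 */
    const uint32_t *t3[P3];        /* slot index = hash mod P3 */
};

/* Return 1 if any of the three slots for `hash` points at an equal value. */
int sig_contains(const struct sig_index *ix, uint32_t hash)
{
    const uint32_t *c[3];
    int i;

    c[0] = ix->t1[hash % P1];
    c[1] = ix->t2[hash % P2];
    c[2] = ix->t3[hash % P3];
    for (i = 0; i < 3; i++)
        if (c[i] != NULL && *c[i] == hash)
            return 1;
    return 0;
}

/*
 * Store `hash` in the pool and point every free slot it maps to at the copy.
 * Returns 1 on success (or if already present), 0 if the pool is full or
 * all three slots are already taken by other values.
 */
int sig_insert(struct sig_index *ix, uint32_t hash)
{
    const uint32_t **slots[3];
    int i, linked = 0;

    if (sig_contains(ix, hash))
        return 1;
    if (ix->count == MAX_VALUES)
        return 0;
    slots[0] = &ix->t1[hash % P1];
    slots[1] = &ix->t2[hash % P2];
    slots[2] = &ix->t3[hash % P3];
    ix->values[ix->count] = hash;
    for (i = 0; i < 3; i++) {
        if (*slots[i] == NULL) {
            *slots[i] = &ix->values[ix->count];
            linked = 1;
        }
    }
    if (linked)
        ix->count++;
    return linked;
}
```

The structure must start out zeroed (for example `struct sig_index ix = {0};`). A lookup costs three modulo operations and at most three comparisons, and an insert only fails when a value collides with existing entries under all three primes at once.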

First I needed to read up on how kernel memory allocation works.
Kernel Korner – Allocating Memory in the Kernel | Linux Journal was a fantastic link that got me right into the memory allocation principles!
Let’s see how the day goes!

Nutch…too much Nutch

The whole of yesterday was spent going through the Nutch source code. Chris and Ashish helped me out, along with this link: Dissecting the Nutch Crawler. This showed me the following:
The file Fetcher.java has a reference to the “content” variable (which is of type Content). I found that initially only the URLs are stored during the crawl, and then a request is sent. Based on the MIME type of the content returned, the ParserFactory class creates a parser (HTML parser, PDF parser, etc.). The code for these parsers can be found at nutch-0.6/src/plugin/. These plugins do the parsing and return the content as a “Parse” object. Using the Parse.getText() method (which we also found interesting) we can get the text content of any page!

Nutching Nutching Nutching

The whole of today was spent analyzing the Nutch source code with Anshul. It is almost 8:00 pm now and nothing has been done yet! I have received an e-mail from Chris Mattman and Ashish Vaidya giving some pointers. Hopefully, it’s gonna help!
I had problems compiling the code as well. It’s strange that when I installed j2sdk from the Sun Java site I did not get javac in the /usr/java/jre1.5.0_02/bin directory. Since I did not have enough time to look for the files, I simply downloaded NetBeans and, on Daddu’s suggestion, got the thing compiled with it!
Will blog later when I have some success with Nutch.
-Rajat.
Rajat’s Abode