Wednesday, January 26, 2011

Buying time?

The Nifty trades around 5663 today, reflecting a P/E of just under 22. The market traded below a P/E of 22 on only 30% of the trading days last year. So, based on 2010 data, the Nifty is correcting from expensive territory.

Is it cheap enough at current levels? Is the market headed further south? I do not know. But I think it makes sense to start accumulating from these levels.

Monday, January 24, 2011

Packet Length Distribution in LAN - II


After running the experiment for about 10 days, the table looks like this...

Pkt Histogram
==============
Length        % of total packets
================================
[   0-  63]   21.28
[  64- 127]    0.69
[ 128- 191]    5.43
[ 192- 255]    0.83
[ 256- 319]    0.56
[ 320- 383]    0.03
[ 384- 447]    0.04
[ 448- 511]    0.06
[ 512- 575]    0.07
[ 576- 639]    0.07
[ 640- 703]    0.05
[ 704- 767]    0.05
[ 768- 831]    0.05
[ 832- 895]    0.03
[ 896- 959]    0.04
[ 960-1023]    0.03
[1024-1087]    0.03
[1088-1151]    0.04
[1152-1215]    0.04
[1216-1279]    0.10
[1280-1343]    0.02
[1344-1407]    1.42
[1408-1471]    0.02
[1472-1535]   69.01

Sunday, January 23, 2011

Watson, What's next? - Ultimate knowledge economy.

IBM has built a new supercomputer program called Watson, and it is capable of playing the Jeopardy quiz game against humans and beating them hands down. This amazing supercomputer is the result of the DeepQA project, which "understands" natural human language and answers questions from a knowledge base built, again, by processing millions of articles written in natural language. So, this is how Watson plays the game: just like the human contestants, it takes in the "answer" on the selected topic, processes it, and understands the hint. It searches its knowledge base, comes up with the correct "question" before the others, and presses the buzzer. Then, when asked, Watson 'speaks out' the question in natural language. If the "question" is correct, the contestant gains points; otherwise they lose points.

In other words, a computer program can read and understand any printed book or newspaper archive and correlate events, hence building its 'knowledge'. Here is an interesting video by David Ferrucci, the lead scientist of the DeepQA project. What interested me first is that Watson has been improving its answering accuracy for three years now, and it can beat grandmasters of the Jeopardy game. Another piece of detail: Watson is built by networking what seem to be server-grade computers, and it runs 3000 cores simultaneously. It is NOT a keyword-matching search engine.

OK, now let me speculate a little here about what might happen next:

1) Immediate: Dawn of cog-engine
IBM might roll out its own "search engine" whose results are far better than Google's. They would probably call it a "Cog Engine" rather than a search engine. Who knows, Google might already be working on something similar. The point here is not IBM or Google; it is that "search" as we know it today might change dramatically.


2) Immediate to near future: Worthless "Research"
"Research" would lose its meaning. Anyone with access to a Watson-type program could produce in a day or two the results that today's researchers produce with years of effort. So it is imperative that human researchers either find something really complex to research OR research improving the Watson algorithm itself!


3) Immediate to near future: Hand-held Watson
You would be able to conduct research/learning/QA from your mobile devices. Mobile phones would be so powerful that they could easily do the "knowledge retrieval" part, while the "knowledge gathering" part could be done by relatively powerful servers in the cloud.

4) Far future: Download F-16 flying skill
Someone will eventually figure out how the human brain analyses natural language and makes decisions, vis-a-vis how Watson does it; Watson was made to mimic the brain in the first place anyway. Once that is done, "knowledge" can be moved freely across machines and humans. Think about it: computers could learn a subject and move the "knowledge" into human brains, and vice versa.

5) Digitize the skill:
The logical next step would be to digitize the "skill set". Until today, humans have acquired knowledge individually. That means one has to learn a subject by reading books, practising the art and so on, basically by putting in time and effort. He or she eventually becomes an expert in the subject. Then he or she dies. The only way to pass the expertise or knowledge on to other people is by writing books or articles, by giving speeches and so on. So there is no "continuity" in the knowledge-gaining process. Now imagine one could digitize the "expertise" and pass it on to others exactly as he or she left it, without needing to write a book and without others needing to read and work on it to get to that point.
Imagine you could download Linus Torvalds' operating-system knowledge, Larry Page's and Sergey Brin's analytical knowledge, and Steve Jobs' product finish and marketing sense right into your brain!
No one would ever need to write books, articles, papers, exams and so on. You could directly submit your brain map for evaluation when you graduate from college! The implications are profound.

6) Cog-market: sell your skill for a stay at a ski resort
Imagine a knowledge market, where skill sets at various maturity levels get traded like stocks in today's stock/commodity markets. One could learn any language in a jiffy. There could be rating agencies that grade the quality of people's knowledge. So people would become the hardware that constantly runs and improves humanity's software - its knowledge system, that is.

7) Far into future: The Big Leap
We humans took some 10,000 years from organizing ourselves into civilizations to developing the steam engine (invented in the mid 18th century) and hence the industrial revolution. It took us less than 50 years from getting our first practical computers to where we are now. Previously, it would take decades to go from conceptualizing a product to developing it to seeing it in the market. Nowadays, it is a matter of years, sometimes less than a year.
Clearly, we are already on an exponential growth trajectory in improving our collective cognizance. This is primarily because we are building on the knowledge we have gained so far. If the predictions I have dared to make come true... if at all, then the exponential growth is not just going to get faster. It is going to be a complete game changer. Our growth path would be hyper-exponential, or hyper-hyperbolic.

The exponential growth seems very impressive given where we started from. But in absolute terms, humanity is still below Type I on the Kardashev scale. We are at about 0.7 on the Kardashev scale, while aliens/extraterrestrials are expected to be Type II or even Type III civilizations. Ultimately, this hyper-hyperbolic growth would help propel our civilization to Type I and onwards into Type II.

I would think all or most of it will happen in the next three, four or five decades. We are going to live in interesting times. The Chinese curse, "May you live in interesting times", is truer for this generation than for anybody before.

Monday, January 17, 2011

mini Howto: write your own iptables target module

How to extend iptables

I searched the existing iptables target/match modules to see whether any would help me collect packet length statistics on the LAN. I set out to use the iptables infrastructure because it offers a nice packet-management framework. I did not find a readily available target, so I decided to write my own.
What I found is that it is surprisingly easy to write your own target module.

Design:
My target module inspects the length of every packet it receives. iptables rules can be set up to send packets matching a specific source/destination/port number/application/time of day etc. to the newly written module. The module builds an array of 32 integers, a size that is modifiable at compile time. The array elements act as packet counters: array[0] counts the packets whose length falls in 0-63 bytes, and so on. My sampling step size is 64 bytes, which is again programmable at compile time.
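In C, the heart of that design is just a counter array like the minimal sketch below. The names PHIST_STEP, PHIST_BUCKETS and pkt_hist are illustrative here, not necessarily the ones in my module:

#define PHIST_STEP    64   /* sampling step size in bytes, compile-time */
#define PHIST_BUCKETS 32   /* number of counters, compile-time */

/* pkt_hist[0] counts 0-63 byte packets, pkt_hist[1] counts 64-127, ... */
static atomic_t pkt_hist[PHIST_BUCKETS];

Using atomic_t keeps the counters safe to bump from netfilter context without a separate lock.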

I used a proc entry to access these counters from user space. For the time being I have avoided a netlink socket (the elegant solution) and printk (the beginner solution); both would work too.

The module has entry and exit functions. The entry function creates the proc entry and registers the target: write the target function itself, fill in the xt_target structure, and use that structure to register with iptables as a new target. The exit function de-registers the target and removes the proc entry.

The target function gets the skb pointer to work on. Yes, now we can do anything with the packet. The target function is called by iptables with the packet (skb) as the first argument and the target module's arguments, as typed in the iptables command, as the second argument. To keep things simple, I decided to have my module take no arguments, so it just receives the packet. I get the packet length, quantize it into the 32 steps, and increment the appropriate counter.
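On a 2.6.32-era kernel, the target function would look roughly like the sketch below; the hook signature has changed across kernel versions, so take this as illustrative rather than exact:

static unsigned int
phist_tg(struct sk_buff *skb, const struct xt_target_param *par)
{
        unsigned int bucket = skb->len / PHIST_STEP;

        /* clamp anything larger than the last bucket */
        if (bucket >= PHIST_BUCKETS)
                bucket = PHIST_BUCKETS - 1;
        atomic_inc(&pkt_hist[bucket]);

        /* let the packet continue down the rule chain untouched */
        return XT_CONTINUE;
}

Returning XT_CONTINUE is what makes this a purely observational target: the packet proceeds to the next rule as if nothing happened.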

To export the constructed histogram, I have written an output function which is invoked whenever a user reads the /proc/pkt-hist file. This function just prints out the array.
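On 2.6.32, the old read_proc interface is the quickest way to do this. A sketch (it prints raw counts; the percentages in my tables are computed from those):

static int phist_read_proc(char *page, char **start, off_t off,
                           int count, int *eof, void *data)
{
        int i, len = 0;

        len += sprintf(page + len, "Pkt Histogram\n==============\n");
        for (i = 0; i < PHIST_BUCKETS; i++)
                len += sprintf(page + len, "[%4d-%4d]=%d\n",
                               i * PHIST_STEP, (i + 1) * PHIST_STEP - 1,
                               atomic_read(&pkt_hist[i]));
        *eof = 1;
        return len;
}

A simple cat /proc/pkt-hist then dumps the whole table.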

PS: Extending iptables involves writing the actual target or match module, which does the job in kernel space, AND writing a supporting user-space library to parse the arguments and let iptables know what to do when the target is typed on the command line, e.g.
iptables -I FORWARD -s 192.168.x.y/32 -d 0.0.0.0/0 -j PHIST


Steps:
1) Make sure you have recent kernel code downloaded and the netfilter options enabled. I had linux-2.6.32. Also make sure /usr/src/linux points to /usr/src/linux-2.6.32/, where the kernel source is available.

2) Get the iptables source code and untar it. Do ./configure; make; make install - the usual procedure. I had iptables-1.4.3.2.

3) Implement the target: the easiest way to write your own target is to take an existing target module as a reference. Go into /usr/src/linux/net/netfilter/. You will find all the match and target modules supported by that kernel's netfilter. Match modules are named in small letters while targets are written in caps; for e.g., xt_mark.c is a match module, while xt_MARK.c is a target module.
I copied xt_NOTRACK.c, which is a very simple target module and fits my bill. I removed all the code inside the target function and wrote my own, and renamed all the functions and structures appropriately. In the entry function I created the /proc/pkt-hist file and added the required callback code for it.
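Put together, the boilerplate around the target is small. A sketch modelled on xt_NOTRACK.c from 2.6.32, using the illustrative names from the snippets above (error handling trimmed):

#include <linux/module.h>
#include <linux/skbuff.h>
#include <linux/proc_fs.h>
#include <linux/netfilter/x_tables.h>

static struct xt_target phist_tg_reg __read_mostly = {
        .name   = "PHIST",
        .family = NFPROTO_UNSPEC,   /* usable from IPv4 and IPv6 tables */
        .target = phist_tg,
        .me     = THIS_MODULE,
};

static int __init phist_tg_init(void)
{
        struct proc_dir_entry *pe;

        /* entry function: create /proc/pkt-hist and register the target */
        pe = create_proc_entry("pkt-hist", 0444, NULL);
        if (!pe)
                return -ENOMEM;
        pe->read_proc = phist_read_proc;

        return xt_register_target(&phist_tg_reg);
}

static void __exit phist_tg_exit(void)
{
        /* exit function: de-register the target, remove the proc entry */
        xt_unregister_target(&phist_tg_reg);
        remove_proc_entry("pkt-hist", NULL);
}

module_init(phist_tg_init);
module_exit(phist_tg_exit);
MODULE_LICENSE("GPL");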

4) Modify Kconfig: now, in /usr/src/linux/net/netfilter/, open Kconfig in the vi editor and search for NOTRACK. You will find

config NETFILTER_XT_TARGET_NOTRACK
	tristate '"NOTRACK" target support'
	depends on IP_NF_RAW || IP6_NF_RAW
	depends ....

If you recollect, this is what appears in the "make menuconfig" program. So copy-paste this entire section and adapt it as below:
config NETFILTER_XT_TARGET_PHIST
	tristate '"PHIST" target support'
	depends on NF_CONNTRACK
	help
	  The PHIST target collects the packet length distribution
	  in pre-set steps.

	  If you want to compile it as a module, say M here and read
	  <file:Documentation/kbuild/modules.txt>. If unsure, say `N'.


5) Implement the user-space library: go into /usr/src/iptables-1.4.3.2/extensions and you will find libxt_NOTRACK.c. Clone it into libxt_PHIST.c and modify the names of the functions appropriately.
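Since PHIST takes no command-line options, the user-space part is almost empty. A stripped-down libxt_PHIST.c for iptables 1.4.3.x might look like this sketch:

#include <stdio.h>
#include <xtables.h>

static void PHIST_help(void)
{
        printf("PHIST target takes no options\n");
}

static struct xtables_target phist_tg_reg = {
        .version       = XTABLES_VERSION,
        .name          = "PHIST",
        .family        = NFPROTO_UNSPEC,
        .size          = XT_ALIGN(0),
        .userspacesize = XT_ALIGN(0),
        .help          = PHIST_help,
};

/* runs automatically when iptables dlopen()s the shared object */
void _init(void)
{
        xtables_register_target(&phist_tg_reg);
}

All this library really does is tell the iptables binary that a target named PHIST exists and takes no arguments.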

6) Compile and fly
Compile the target:
In /usr/src/linux/net/netfilter/, edit the Makefile and search for NOTRACK. You will find the following line:
obj-$(CONFIG_NETFILTER_XT_TARGET_NOTRACK) += xt_NOTRACK.o
Now clone this line for PHIST, as below:
obj-$(CONFIG_NETFILTER_XT_TARGET_PHIST) += xt_PHIST.o

Now, at /usr/src/linux/, do make modules; make modules_install. You will find xt_PHIST.ko in /usr/src/linux/net/netfilter/.
Compile the iptables user space: repeat step (2). You will find libxt_PHIST.so in /usr/src/iptables-1.4.3.2/extensions/.

7) Now try typing
iptables -I FORWARD -s 192.168.x.y/32 -d 0.0.0.0/0 -j PHIST
Now iptables finds xt_PHIST.ko, loads it, and sends every packet satisfying the rule to the target code we just wrote.

So the target module worked like a charm, maybe because it is not doing much. But it was a breeze to write a new target. I guess writing a match module won't be entirely different from what is described here.

Sunday, January 16, 2011

Packet Length Distribution in LAN

In this post, I had copy-pasted from the internet (of course, citing the author) the packet length distribution on the internet. I always wanted to do that experiment myself. Last week, I finally could do just that. This is just a first experiment, in which I captured all of my own packets going into the LAN, not purely the internet.

The setup was similar to what I described in my previous post. Replace the "LAN cloud" with my PC and the "WAN cloud" with my office servers and the internet. The Linux box is a dual-NIC PC running FC10, which I configured as a bridge. I wrote an iptables target module and passed all packets emerging from my PC through it.

The target module I wrote sampled the length of every packet it received and built a histogram, with a step size of 64 bytes. After running it for a couple of days, the result looks like this:

Pkt Histogram
==============
Length        % of total packets
================================
[ 0- 63]=18.91%
[ 64- 127]=0.84%
[ 128- 191]=5.76%
[ 192- 255]=0.47%
[ 256- 319]=0.8%
[ 320- 383]=0.05%
[ 384- 447]=0.08%
[ 448- 511]=0.09%
[ 512- 575]=0.13%
[ 576- 639]=0.17%
[ 640- 703]=0.09%
[ 704- 767]=0.06%
[ 768- 831]=0.05%
[ 832- 895]=0.03%
[ 896- 959]=0.03%
[ 960-1023]=0.03%
[1024-1087]=0.04%
[1088-1151]=0.04%
[1152-1215]=0.04%
[1216-1279]=0.2%
[1280-1343]=0.02%
[1344-1407]=0.62%
[1408-1471]=0.03%
[1472-1535]=71.44%
[1536-1599]= 0%
[1600-1663]= 0%
[1664-1727]= 0%
[1728-1791]= 0%
[1792-1855]= 0%
[1856-1919]= 0%
[1920-1983]= 0%
[1984-2047]= 0%

Now I am tempted to do the same exercise on the packets flowing purely to the internet from my office. I will do that and report in the coming days, so keep visiting this page or just subscribe to the RSS feed of my page.

PS:
My work PC is also a Linux machine, running FC13. The packets captured reflect regular web browsing (http/https), email, NFS (my home directory is NFS-mounted), and NIS.

Coming up next: how to write an iptables target module... Stay tuned.