Thursday, October 29, 2009

My opinions and reactions on the class DPI project

When we first started the class DPI project, the first thing that came to mind was how easy it would be. I mean, how difficult could it be to open packets, inspect their contents, and classify them accordingly? You could say it is the equivalent of your friendly postman opening your mail and sorting it by whether it contains something important, a postcard, some cash, or perhaps spam or even anthrax. And we're not looking for passwords, credit card information, or information for the spooks. No, we are more benevolent than that.

Our purpose is to properly classify the applications running on the network. If we're going to monitor our networks, we must have a complete picture of the applications that our users are running on them. Most of these applications are "visible" and can easily be blocked. However, many applications running on the Internet have acquired the ability to bypass firewalls and proxies. The corporate, academic, technical and what-not policies that have governed networks for the past few years have pushed applications to use proxies or encrypt their communications in order to bypass the usual roadblocks that network administrators have put in place.

The wisdom of such blocks has been seriously questioned, on both the technical and user levels. However, the reality is that these blocks are here to stay, and the applications they target have adapted to the current Internet landscape. A great example of such a versatile program is Skype.

The purpose of our class project was to detect peer-to-peer traffic that has managed to pass through the roadblocks that the university network administrators have put in place.

While we did manage to get a sample of the network traces, we have yet to detect any peer-to-peer activity on the university network. So far, the university network administrators appear to have succeeded in their "quest" to block all kinds of peer-to-peer traffic.

We also used some machine learning techniques on the traces. However, I think that we have largely failed there because we don't have any training data to use... because no peer-to-peer traffic has been detected. We need data that we positively know contains peer-to-peer traffic. If we can't detect any, then we should run some applications ourselves and actively look for holes in the university network. Once we "detect" our own traces, we can feed them into the machine learning tool as training data to detect the peer-to-peer traffic that does not belong to us.
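
To make the training-data idea concrete, here is a minimal sketch in Python. It is not the tool or the feature set we used in class: the two features (mean packet size and mean inter-arrival time), the labels, and the function names are all assumptions chosen for illustration, and a nearest-centroid rule stands in for whatever classifier is actually used.

```python
from statistics import mean

def flow_features(packet_sizes, timestamps):
    """Summarise one flow as (mean packet size, mean inter-arrival time).

    Both features are hypothetical choices; no payload is read.
    """
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    return (mean(packet_sizes), mean(gaps) if gaps else 0.0)

def train_centroids(labelled_flows):
    """Average the feature vectors of each label.

    labelled_flows is an iterable of (label, feature_tuple) pairs,
    for example flows captured from our own P2P client.
    """
    by_label = {}
    for label, feats in labelled_flows:
        by_label.setdefault(label, []).append(feats)
    return {
        label: tuple(mean(column) for column in zip(*rows))
        for label, rows in by_label.items()
    }

def classify(centroids, feats):
    """Return the label whose centroid is closest (squared Euclidean).

    Features are left unscaled here, so packet size dominates the
    distance; a real setup would normalise them first.
    """
    def dist2(centroid):
        return sum((c - f) ** 2 for c, f in zip(centroid, feats))
    return min(centroids, key=lambda label: dist2(centroids[label]))
```

The point is only that `train_centroids` has nothing to work with until we supply flows that we know are peer-to-peer, which is exactly the gap described above.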

The other technique that the class investigated, actually reading the packet contents, is a hit-or-miss thing. We can argue that reading the first few bytes of the data can give us the name of the actual application. However, once this traffic is encrypted, all bets are off. I believe this technique will only be useful in the near-to-medium term, and will work only on simple applications that have not yet attracted the kind of users who need special methods to bypass proxies and firewalls.
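
A rough illustration of the "first few bytes" approach, as a sketch rather than the code from our project. The signature table is a small assumed sample: the BitTorrent handshake prefix, the SSH banner, two HTTP methods, and the TLS handshake record header. The function name is made up for this example.

```python
# (prefix, label) pairs checked against the start of the first payload.
SIGNATURES = [
    (b"\x13BitTorrent protocol", "bittorrent"),  # length byte 19 + protocol string
    (b"SSH-", "ssh"),                            # SSH identification banner
    (b"GET ", "http"),
    (b"POST ", "http"),
    (b"\x16\x03", "tls"),                        # TLS handshake record, major version 3
]

def classify_payload(payload: bytes) -> str:
    """Return a label for the first payload of a flow, or "unknown"."""
    for prefix, label in SIGNATURES:
        if payload.startswith(prefix):
            return label
    return "unknown"
```

Note what the "tls" label really says: the flow is encrypted, not which application is inside it. A BitTorrent client tunnelled over SSL or SSH lands in the same bucket as ordinary secure web browsing, which is the hit-or-miss part.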

We are not there yet, but we have learned "what not to do in DPI". This may sound like an Edisonian way of thinking, but I believe that as we continue to refine our techniques and code, we will find a way to detect peer-to-peer traffic without reading the payload.

Wednesday, August 12, 2009

rSim network simulation results

Here are the results from my network simulation.

Thursday, July 23, 2009

Thursday, July 9, 2009

Network Traces




Wednesday, July 8, 2009

Circumventing P2P blocks

Assumptions
  • outgoing SSH (port 22) allowed
  • Squid proxy on ports 443 and 80
  • NAT support
  • outgoing VPN allowed

SOCKS4/5 proxy

  • using ssh -D 8080 root@remote-host.com (opens a local SOCKS proxy on port 8080)
  • using proxifiers (HTTP/SOCKS), or stunnel to wrap any single-port TCP service in SSL
  • (then set it as the SOCKS/HTTP proxy in the BitTorrent client)
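
For reference, this is roughly what a client sends to the local SOCKS5 listener that ssh -D opens. It is a sketch of the SOCKS5 wire format only (no-authentication greeting, then a CONNECT request by domain name); the function names and the example host are assumptions, and no socket is opened.

```python
import struct

def socks5_greeting() -> bytes:
    """VER=5, NMETHODS=1, METHODS=[0x00 = no authentication]."""
    return b"\x05\x01\x00"

def socks5_connect_request(host: str, port: int) -> bytes:
    """Build a SOCKS5 CONNECT request that addresses the target by name.

    Layout: VER=5, CMD=1 (CONNECT), RSV=0, ATYP=3 (domain name),
    one length byte, the name, then the port in network byte order.
    """
    name = host.encode("ascii")
    if not 0 < len(name) < 256:
        raise ValueError("host name must be 1 to 255 bytes")
    if not 0 < port < 65536:
        raise ValueError("port out of range")
    return b"\x05\x01\x00\x03" + bytes([len(name)]) + name + struct.pack("!H", port)
```

Because the name is resolved on the far side of the tunnel, the local network only ever sees the SSH connection on port 22.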

Bypassing Squid

  • HTTP CONNECT to peers specified by FQDN (to bypass the filter on CONNECT to IP addresses). The peers are HTTP proxies.
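
A small sketch of the CONNECT request involved, assuming the proxy only rejects IP-literal targets. The helper names and the example host name are made up, and nothing is sent on the network here.

```python
import ipaddress

def is_ip_literal(host: str) -> bool:
    """True if host parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False

def build_connect_request(fqdn: str, port: int = 443) -> bytes:
    """Build an HTTP CONNECT request that names the peer by FQDN."""
    if is_ip_literal(fqdn):
        raise ValueError("use a host name; IP targets are assumed to be filtered")
    target = f"{fqdn}:{port}"
    lines = [f"CONNECT {target} HTTP/1.1", f"Host: {target}", "", ""]
    return "\r\n".join(lines).encode("ascii")

def connect_succeeded(status_line: bytes) -> bool:
    """A 2xx status line means the proxy opened the tunnel."""
    parts = status_line.split(None, 2)
    return (
        len(parts) >= 2
        and parts[0].startswith(b"HTTP/")
        and len(parts[1]) == 3
        and parts[1].startswith(b"2")
    )
```

After a 2xx reply the proxy just relays bytes in both directions, so whatever the peer speaks next is up to the two endpoints.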

P2P on VPN (OpenVPN, IPsec)

  • OpenVPN multiplexes everything over a single TCP/UDP port
  • IPsec: security scheme at layer 3 (OSI network layer / TCP/IP Internet layer)

NAT on tcp/443
  • all browser sessions use proxy

A measurement study on video acceleration service

P. Pan, Y. Cui, and B. Liu, "A measurement study on video acceleration service," in IEEE CCNC, 2009.

Relevance
  • Pipes getting bigger.
  • Bandwidth and storage getting cheaper.
  • Browsers getting smarter.
  • People getting closer / social media.
  • VoD: YouTube / Tudou / Hulu / etc. rely on streaming.

Performance

Buffer
- long time to buffer / multi-connection download, P2P
  • multiple connections over the same data (e.g. TV show)
  • caching at peering points
  • TCP/UDP data transfer
  • intelligent P2P routing between peering points
- buffer may stop / auto-reconnect download session
- multiple instances of buffered data / cache sharing
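
As a generic illustration of the multi-connection download idea in these notes (not the mechanism measured in the paper), a video of known size can be split into byte ranges, one per connection, and each range requested with an HTTP Range header. The function names are assumptions for this sketch.

```python
def split_ranges(total_size, connections):
    """Split [0, total_size) into inclusive (first, last) byte ranges.

    Any remainder is spread over the first ranges so that sizes
    differ by at most one byte.
    """
    if total_size <= 0 or connections <= 0:
        raise ValueError("total_size and connections must be positive")
    connections = min(connections, total_size)
    base, extra = divmod(total_size, connections)
    ranges = []
    start = 0
    for i in range(connections):
        length = base + (1 if i < extra else 0)
        ranges.append((start, start + length - 1))
        start += length
    return ranges

def range_header(byte_range):
    """Format one range as the value of an HTTP Range header."""
    first, last = byte_range
    return f"bytes={first}-{last}"
```

Each connection can then fetch its own slice, and a dropped connection only needs to re-request its remaining bytes instead of restarting the whole buffer.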

Result highlights


Conclusion
  • Accelerator in the browser.
  • ISP peering/caching technology
  • Partial Net neutrality?