Tuesday, June 30, 2009

PBS: Periodic Behavioral Spectrum of P2P Applications

Authors: Tom Z.J., Yan Hu, Xingang Shi, Dah Ming Chiu, and C. S. Lui


Introduction

This paper discusses a new approach to identifying P2P traffic by profiling the specific traffic patterns that a P2P overlay introduces into the network. These profiles expose the "periodic behavior" of the overlay, and this behavior can help identify the system running on the network without port monitoring or payload inspection.

The paper introduces a novel approach, the Two-Phase Transformation approach.

Experiment Design

The research distinguishes two kinds of periodic group communication:
1. Control plane - control signals for the overlay
2. Data plane - actual data flows in the overlay network

The research also identifies three major types of periodic behavior:
1. Buffermap exchange
  • typical of P2P streaming
  • peers exchange buffer information periodically using buffer maps
2. Content flow control
  • mechanism for limiting the download rate of peers
  • introduces periodic data flows
3. Synchronized Link Activation and Deactivation
  • used in BitTorrent
Methodology

  • PBS pattern identification was done on a selected PC on the network
  • Packets were captured using Wireshark
Two-phase transformation
1. Capture inbound and outbound packets
2. Graph packet traffic on a timeline
3. Autocorrelation of the timeline
4. Discrete Fourier Transform
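
The four steps above can be sketched in a few lines of Python. This is only an illustration under assumptions, not the paper's implementation: the function names, the bin width, and the choice to apply the DFT to the autocorrelation output (as the order of the list suggests) are all assumptions, and the DFT is written out directly with the standard library instead of using an FFT package.

```python
import cmath
import math


def packet_timeline(timestamps, bin_width, duration):
    """Step 2: count packets per time bin to build the timeline."""
    n_bins = int(duration / bin_width)
    counts = [0] * n_bins
    for t in timestamps:
        idx = int(t / bin_width)
        if 0 <= idx < n_bins:
            counts[idx] += 1
    return counts


def autocorrelation(series):
    """Step 3: normalized autocorrelation of the mean-removed timeline."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    var = sum(c * c for c in centered)
    if var == 0:
        return [0.0] * n  # a flat timeline has no periodic structure
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag)) / var
        for lag in range(n)
    ]


def dft_magnitudes(series):
    """Step 4: magnitudes of a direct O(n^2) DFT, bins 0..n/2."""
    n = len(series)
    return [
        abs(sum(series[t] * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def dominant_period(timestamps, bin_width, duration):
    """Return the period (in seconds) of the strongest non-DC frequency."""
    timeline = packet_timeline(timestamps, bin_width, duration)
    acf = autocorrelation(timeline)
    mags = dft_magnitudes(acf)
    k = max(range(1, len(mags)), key=lambda i: mags[i])
    return len(acf) * bin_width / k


# Synthetic trace (assumed data): one packet per second for 4 s, then 4 s idle.
trace = [8 * p + off + 0.5 for p in range(8) for off in range(4)]
period = dominant_period(trace, bin_width=1.0, duration=64.0)
```

A real capture would supply `timestamps` from the Wireshark trace, and the resulting spectrum (not just its largest peak) would serve as the profile to compare against known applications.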

Results
  • PBS profiles were produced for a range of P2P clients (mostly P2PTV) such as TVAnts, SopCast, PPStream, eMule, Joost, PPMate, PPLive, TVKoo and UUSee
  • Tested in two scenarios: a computer inside the LAN and a computer connected through DSL
  • Tested the usefulness of the PBS profiles by capturing traffic for two days at the campus gateway. The reported results identified running P2P traffic with 100% accuracy
Critique
  • The number of experiments is not sufficient to validate P2P traffic identification using PBS
  • PBS profiles were generated from the inbound and outbound traffic of a single node, not from gateway traffic. This could introduce inaccuracy in the PBS profiles
  • Packet header confirmation is still needed.

Thursday, June 25, 2009

State of the Art in Traffic Classification: A Research Review

M. Zhang, W. John, k. claffy, and N. Brownlee, "State of the art in traffic classification: A research review," PAM Student Workshop, 2009.


They surveyed 64 papers with over 80 data sets to create a structured taxonomy of traffic classification papers. The taxonomy is based on the following definition of traffic classification:

"Methods of classifying traffic data sets based on features passively observed in the traffic, according to specific classification goals."

They grouped the papers into 5 categories: analysis, surveys, tools, methodology and others. They used the key attributes from the definition (data sets, classification goals, methods, and features) to categorize each paper.

Data sets:
- can be classified based on what type of traffic it is, where it was collected, etc.

Classification goals:
- can be coarse-grained (P2P, transaction-oriented) or fine-grained (a specific application)

Methods:
- exact method (via port numbers)
- heuristics (based on patterns)
- machine learning methods: supervised or unsupervised learning
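
As a rough illustration of the exact (port-based) method, a classifier can be as simple as a lookup table. The sketch below is an assumption-laden example, not something taken from the survey: the function name and the tiny port table are hypothetical, and the table lists only a few IANA well-known ports.

```python
# Illustrative table of a few IANA well-known ports; a real tool would
# need a far larger and regularly updated mapping.
WELL_KNOWN_PORTS = {
    22: "ssh",
    25: "smtp",
    53: "dns",
    80: "http",
    443: "https",
}


def classify_by_port(src_port, dst_port):
    """Return an application label from either port, or 'unknown'."""
    # Check the destination port first, then fall back to the source port.
    for port in (dst_port, src_port):
        label = WELL_KNOWN_PORTS.get(port)
        if label is not None:
            return label
    return "unknown"


web_flow = classify_by_port(51515, 80)
odd_flow = classify_by_port(40000, 40001)
```

Any flow on ports outside the table falls through to `"unknown"`, which is why applications that pick dynamic ports push classifiers toward heuristics or machine learning.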

Features:
- choosing features to use for traffic classification is related to trends in application development. A good example given in the paper is the trend of modern applications to use UDP instead of TCP and to change ports from time to time. Because of this, mere examination of port numbers may not be enough and we might need to look at payload, flows, etc.

Using the taxonomy that they developed, they tried to answer the following question: How much of the modern Internet is P2P?

The following are the observations they have gathered from the papers they've surveyed:
- 1.2% to 93% of the traffic is due to P2P file sharing (range observed across 18 of the 64 papers)
- the fractions have increased from 2002 to 2006
- P2P is more popular in Europe
- P2P traffic varies by time of day with higher percentages at night
- P2P is used more at home than in the office

Based on this, they could not make a conclusive claim to answer the question above. All they can say is,

"there is a wide range of P2P traffic on Internet links; see your specific link of interest and classification technique you trust for more details."

Shortcomings of current traffic classification:
- lack of shared current data sets
- lack of standardized measures and classifications

Wednesday, June 24, 2009

Trends and Differences in Connection-behavior within Classes of Internet Backbone Traffic

Authors: W John, S Tafvelin, T Olovsson

The paper focuses on three main traffic classes:
  • P2P file-sharing protocols
  • Web traffic
  • Malicious and attack traffic
Results reported:
  • P2P and HTTP traffic exhibit different peak times
    • HTTP traffic is most active during office hours
    • P2P traffic peaks during the night, accounting for up to 90% of transfer volume
  • The TCP SACK option has been deployed mostly on clients, but usage differs by class
    • Web servers neglect its usage
    • Most P2P hosts use it
  • Malicious and attack traffic continues all day without rest
    • Its level remains constant, even when the overall traffic volume has increased
Critiques:
  • The basis for choosing the three main classes is unclear
  • What about real-time applications? Have they been lumped together with the HTTP and/or P2P categories?