Page 1: Case study ap log collector

Case Study: Wireless AP Traffic Collector using Hadoop

Speaker: Jyun-Yao Haung ([email protected])

Page 2: Case study ap log collector

Hadoop is a useful package for cloud computing environments, and it is also suitable for Big Data computing.

In this case study, we are concerned with how to collect real-world wireless AP traffic records and store them in Hadoop HDFS.

With this system, we can perform more real-world big data analysis based on these records.

Without real-world, qualified data, there is no efficient big data processing and analysis!

Introduction

Page 3: Case study ap log collector

Our DDWRT wireless AP is based on the TP-Link WR941ND v2/v3. It has only 4 MB of flash storage and 30 MB of memory. Moreover, it does not provide powerful tools such as "RFlow" for offering comprehensive traffic detail or "curl" for posting data.

In order to simplify the tasks of each machine, we need to prepare some servers to handle the jobs.

We consider the "Enterprise Application Integration" scenario for general purposes.

Limitation

Page 4: Case study ap log collector

We consider three stages to collect the traffic data:

1st: The wireless AP pushes the traffic records out to the 2nd-stage server.

2nd: This server collects the records and makes an Avro RPC call to the 3rd-stage server.

3rd: This server puts the data into its HDFS and sends the response back to the 2nd.

Optionally, the 2nd-stage server can keep some logs in order to trace failed events.

Limitation (cont.)

Page 5: Case study ap log collector

The Process for Collecting Traffic Data

DDWRT Wireless Traffic Data --([GET] HTTP access)--> JSP Server for Transmission --(Apache Avro RPC)--> Hadoop HDFS Server

Use Hadoop HDFS to collect DDWRT Wireless Traffic Data

Page 6: Case study ap log collector

The Details of the Methods

Page 7: Case study ap log collector

Crontab

* * * * * root [ ! -f /tmp/postdata.sh ] && wget http://your_domain/postdata.sh -O /tmp/postdata.sh && chmod +x /tmp/postdata.sh
* * * * * root [ ! -f /tmp/wrtbwmon ] && wget http://your_domain/wrtbwmon -O /tmp/wrtbwmon && chmod +x /tmp/wrtbwmon
* * * * * root /tmp/wrtbwmon setup br0
*/30 0-3 * * * root /tmp/wrtbwmon update /tmp/usage.db peak
10,40 0-3 * * * root /tmp/postdata.sh
*/30,59 4-8 * * * root /tmp/wrtbwmon update /tmp/usage.db offpeak
10,40 4-8 * * * root /tmp/postdata.sh
*/30 9-23 * * * root /tmp/wrtbwmon update /tmp/usage.db peak
10,40 9-23 * * * root /tmp/postdata.sh

Wireless AP: DDWRT Settings

Page 8: Case study ap log collector

Post Data Script

#!/bin/sh
data=`cat usage.db | tr "\n" "$" | tr " " "_"`
wget http://your_domain/ddwrt_collector.jsp?data=${data} -O /dev/null

Wireless AP: DDWRT Settings (cont.)
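As a quick illustration of the encoding above (using hypothetical usage.db contents), newlines become "$" and spaces become "_", so the whole file fits into a single [GET] query parameter:

```shell
# Hypothetical usage.db contents: one comma-separated record per line.
printf 'AA:BB:CC:DD:EE:FF,10,20,0,0,2013-10-01\n11:22:33:44:55:66,5,6,0,0,2013-10-01\n' > usage.db

# The same pipeline as the post data script: flatten the file for the query string.
data=`cat usage.db | tr "\n" "$" | tr " " "_"`
echo "$data"
# AA:BB:CC:DD:EE:FF,10,20,0,0,2013-10-01$11:22:33:44:55:66,5,6,0,0,2013-10-01$
```

The JSP server later reverses this by tokenizing on "$" to recover the individual records.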

Page 9: Case study ap log collector

Prepare the needed libraries for Apache Tomcat in <CATALINA>/lib:

jackson-core-asl-1.9.13.jar
jackson-mapper-asl-1.9.13.jar
avro-1.7.5.jar
avro-ipc-1.7.5.jar
netty-3.4.0.Final.jar
slf4j-api-1.7.5.jar

Put your compiled Avro-RPC classes into <CATALINA>/lib as well.

Receive the [GET] data from the DDWRT AP.

Call Avro-RPC to put the data.

Wait for the data to be put into Hadoop HDFS…

JSP Server

Do NOT include avro-tools.jar
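The slides do not show the Avro protocol itself, but from the names used in the code (Usage, Response, NubLookup, send, is_ok) it can be sketched in Avro IDL roughly as follows; the namespace and exact field types are assumptions:

```
@namespace("your.package")
protocol NubLookup {
  record Usage {
    string mac_addr;
    int upload_peak_on;
    int download_peak_on;
    int upload_peak_off;
    int download_peak_off;
    string time;
  }
  record Response {
    boolean is_ok;
  }
  Response send(Usage request);
}
```

Compiling a protocol like this with the Avro specific compiler produces the Usage, Response, and NubLookup classes referenced in the JSP and server code.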

Page 10: Case study ap log collector

<body>
<%
String mes = "";
try {
    String inputData = "";
    if (request.getParameter("data") != null)
    {
        Usage req = new Usage();
        inputData = request.getParameter("data");
        StringTokenizer line = new StringTokenizer(inputData, "$");
        while (line.hasMoreTokens())
        {
            NettyTransceiver client = new NettyTransceiver(
                new InetSocketAddress(<your_host>, <your_port>));
            NubLookup proxy = (NubLookup) SpecificRequestor.getClient(NubLookup.class, client);
            String record = line.nextToken();
            String[] items = record.split(",");
            req.mac_addr = new Utf8(items[0]);
            req.upload_peak_on = Integer.parseInt(items[1]);
            req.download_peak_on = Integer.parseInt(items[2]);
            req.upload_peak_off = Integer.parseInt(items[3]);
            req.download_peak_off = Integer.parseInt(items[4]);
            req.time = new Utf8(items[5]);
            mes += "Result:" + proxy.send(req).is_ok + "<br />";
            client.close();
        }
    }
} catch (IOException e) {
    mes = e.toString();
}
%>
<h1><%=mes %></h1>
</body>

JSP Server: procedure code

Avro Serializable RPC call
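The string handling in the JSP above can be tried outside of Tomcat. The following plain-Java sketch (hypothetical class name, no Avro involved) reproduces just the tokenizing step: records are split on "$", then fields on ",":

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class ParseDemo {
    // Mirrors the JSP loop: records are separated by '$', fields by ','.
    public static List<String[]> parse(String data) {
        List<String[]> records = new ArrayList<>();
        StringTokenizer line = new StringTokenizer(data, "$");
        while (line.hasMoreTokens()) {
            records.add(line.nextToken().split(","));
        }
        return records;
    }

    public static void main(String[] args) {
        String data = "AA:BB:CC:DD:EE:FF,10,20,0,0,2013-10-01"
                    + "$11:22:33:44:55:66,5,6,0,0,2013-10-01$";
        List<String[]> records = parse(data);
        System.out.println(records.size());     // 2
        System.out.println(records.get(0)[0]);  // AA:BB:CC:DD:EE:FF
    }
}
```

In the real JSP, each parsed record then becomes one Usage object for the Avro RPC call.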

Page 11: Case study ap log collector

Await the Avro RPC.

When an RPC call comes in, put the data carried by the parameters of the RPC into HDFS.

You should put the needed libraries (generated source code) on the classpath:

org.apache.avro.data.*
org.apache.avro.generic.*
org.apache.avro.ipc.*

Hadoop HDFS Server

Page 12: Case study ap log collector

public class RPC
{
    private static NettyServer server;

    // A mock implementation
    public static class NubLookupImpl implements NubLookup {
        public Response send(Usage request) throws AvroRemoteException {
            Response r = new Response();
            try {
                Calendar cal = Calendar.getInstance();
                SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd-HH-mm-ss-S");
                Path pt = new Path("ddwrt_log/" + sdf.format(cal.getTime()));
                Configuration conf = new Configuration();
                conf.addResource(new Path("core-site.xml"));
                FileSystem fs = FileSystem.get(conf);
                BufferedWriter br = new BufferedWriter(new OutputStreamWriter(fs.create(pt, true)));
                String record = request.mac_addr + " " + request.upload_peak_on + " " + request.download_peak_on + " " + request.upload_peak_off + " " + request.download_peak_off + " " + request.time;
                System.out.println(record);
                br.write(record);
                br.close(); // close the stream, or the record may never be flushed to HDFS
                r.is_ok = true;
            }
            catch (Exception exp)
            {
                System.out.println(exp);
                r.is_ok = false;
            }
            return r;
        }
    }

    public static void main( String[] args )
    {
        server = new NettyServer(new SpecificResponder(NubLookup.class, new NubLookupImpl()),
                                 new InetSocketAddress(<your_host>, <your_port>));
        server.getPort(); // the NettyServer starts listening on construction
    }
}

NettyServer with Custom Avro-RPC Libraries

RPC Call Implementation

Page 13: Case study ap log collector

Stored Records in HDFS

Page 14: Case study ap log collector

Avro

Avro-Tools

Avro-IPC

NettyServer/Netty Project

Jackson JSON Packages

Hadoop

Apache Tomcat/JSP

Simple Logging Facade for Java (SLF4J)

Wrtbwmon

DDWRT

Maven

The most important thing: be patient.

Reference: Needed Packages or Software Packs

Page 15: Case study ap log collector

Thank you!

Page 16: Case study ap log collector

D-Link AP Array: integrated software for controlling a number of D-Link APs. (Each AP costs NT$3,600.)

Related Work: Current Techniques

Page 17: Case study ap log collector

RFlow

Page 18: Case study ap log collector

RFlow (cont.)