facebook Powered By : M.Bahmani H.Adldoost S.Entekhabi Urmia University April 2011

Facebook (stylized facebook) is a Social Networking System and website launched in February 2004, operated and privately owned by Facebook, Inc. As.

Dec 27, 2015


Jewel Dickerson
Transcript
Page 1:

facebook

Powered By:

M. Bahmani, H. Adldoost, S. Entekhabi

Urmia University, April 2011

Page 2:

What is Facebook?

Page 3:

Facebook (stylized facebook) is a social networking system and website launched in February 2004, operated and privately owned by Facebook, Inc.

As of January 2011, Facebook has more than 600 million active users. Users may create a personal profile, add other users as friends, and exchange messages, including automatic notifications when they update their profile . . .

Page 4:

Facebook has the largest Hadoop installation. A year ago Yahoo! was first; now it is second. Facebook is also the creator of Hive.

But what are Hadoop and Hive?

Page 5:

HADOOP:

Apache Hadoop is a software framework that supports data-intensive distributed applications under a free license. It enables applications to work with thousands of nodes and petabytes of data. Hadoop was inspired by Google's MapReduce and Google File System (GFS) papers. Hadoop is a top-level Apache project being built and used by a global community of contributors using the Java programming language. Yahoo! has been the largest contributor to the project, and uses Hadoop extensively across its businesses.

HADOOP SEMINAR – 2010

Page 6:

MAP REDUCE:

In other words: Hadoop is an open source framework that enables data-intensive distributed applications to process very large amounts of data efficiently. It is an open source implementation of the MapReduce approach to processing data. MapReduce was invented at Google to deal with the massive quantities of data necessary to index the web. There are two main components to the system: the Hadoop Distributed File System (HDFS), which stores and maintains data across many machines, and the MapReduce engine, which processes the data.
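To make the MapReduce approach concrete, here is a minimal single-process sketch of the classic word-count job. It only illustrates the map, shuffle, and reduce steps; it does not use Hadoop or HDFS, and the function names (map_phase, shuffle, reduce_phase, word_count) are made up for this illustration. On a real cluster the map and reduce calls would run in parallel on many machines.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group all emitted values by their key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce step: collapse all values for one key into a single total."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence (illustrative, not distributed)."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())
```

Calling word_count on a list of text lines returns a dictionary mapping each word to the number of times it occurs.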

Page 7:

Hadoop was created by Doug Cutting, who named it after his son's toy elephant. It was originally developed to support distribution for the Nutch search engine project.

Page 8:

THERE ARE MANY TECHNOLOGIES BUILT ON TOP OF HADOOP THAT NEED TO BE CONSIDERED FOR YOUR SYSTEM. FOR EXAMPLE, THERE IS:

Hive: A SYSTEM FOR OFFLINE ANALYSIS

Page 9:

So what is Hive?

Hive is a data warehouse infrastructure built on top of Hadoop. It provides tools for easy data ETL (extract, transform, load), a mechanism for imposing structure on the data, and the ability to query and analyze large data sets stored in Hadoop files. Hive defines a simple SQL-like query language, called QL, that enables users familiar with SQL to query the data. At the same time, the language lets programmers familiar with the MapReduce framework plug in custom mappers and reducers to perform more sophisticated analysis that the built-in capabilities of the language may not support.
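One way to plug in a custom mapper is Hive's TRANSFORM clause, which streams rows to an external script as tab-separated lines on standard input and reads rows back from standard output. The sketch below is a hypothetical mapper of that kind: the table name page_views, the columns user_id and url, and the script name extract_domain.py are all assumptions made for this example, not part of the original slides.

```python
import sys
from typing import Iterable, Iterator

# Hypothetical Hive usage (table, column, and file names are assumptions):
#   ADD FILE extract_domain.py;
#   SELECT TRANSFORM (user_id, url)
#          USING 'python extract_domain.py'
#          AS (user_id, domain)
#   FROM page_views;


def transform_rows(lines: Iterable[str]) -> Iterator[str]:
    """Turn 'user_id<TAB>url' rows into 'user_id<TAB>domain' rows."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 2:
            continue  # skip rows that do not have exactly two columns
        user_id, url = fields
        # Drop an optional scheme such as 'http://', then keep the host part.
        domain = url.split("//")[-1].split("/")[0]
        yield f"{user_id}\t{domain}"


def main() -> None:
    """Entry point for script use: read rows from stdin, write to stdout."""
    for row in transform_rows(sys.stdin):
        print(row)
```

When the file is used as a script, main() would need to be called at the bottom of the file. The row logic is kept in transform_rows so it can be tested without Hive.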

Page 10:
Page 11:
Page 12:

MySQL Federated Storage Engine:

The MySQL Federated storage engine for the MySQL relational database management system is a storage engine that allows a user to create a table that is a local representation of a foreign (remote) table. It uses the MySQL client library API as a data transport, treating the remote data source the same way other storage engines treat local data sources, whether they are MYD files (MyISAM), memory (Cluster, Heap), or tablespace (InnoDB). For each Federated table that is defined, there is one .frm file (a data definition file containing information such as the URL of the data source). The actual data can exist on a local or remote MySQL instance.
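As a rough illustration of how such a table is declared, the sketch below builds a CREATE TABLE statement that uses ENGINE=FEDERATED with a CONNECTION string pointing at the remote table. The helper name federated_table_ddl and every value passed to it are hypothetical; the helper does no SQL escaping, and the local column definitions are assumed to match the remote table.

```python
def federated_table_ddl(
    table: str,
    columns_sql: str,
    user: str,
    host: str,
    port: int,
    database: str,
    remote_table: str,
) -> str:
    """Build a CREATE TABLE statement for a Federated table (no escaping)."""
    # Connection string layout: mysql://user@host:port/database/table
    connection = f"mysql://{user}@{host}:{port}/{database}/{remote_table}"
    return (
        f"CREATE TABLE {table} (\n"
        f"  {columns_sql}\n"
        f") ENGINE=FEDERATED\n"
        f"CONNECTION='{connection}';"
    )


# Example values are made up for illustration only.
ddl = federated_table_ddl(
    table="users_local",
    columns_sql="id INT NOT NULL, name VARCHAR(64), PRIMARY KEY (id)",
    user="fed_user",
    host="remote.example.com",
    port=3306,
    database="app",
    remote_table="users",
)
```

The resulting statement would be run on the local server; queries against users_local are then forwarded to the remote users table.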

Page 13:

Oracle RAC:

Oracle Real Application Clusters (RAC) is an option for the Oracle Database software produced by Oracle Corporation and introduced in 2001 with Oracle9i that provides software for clustering and high availability in Oracle database environments. Oracle RAC allows multiple computers to run Oracle RDBMS software simultaneously while accessing a single database, thus providing a clustered database.

Page 14:

An example of the advantages of Hadoop/Hive, as told by a Facebook stakeholder:

When we started at Facebook in 2007 all of the data processing infrastructure was built around a data warehouse built using a commercial RDBMS. The data that we were generating was growing very fast – as an example we grew from a 15TB data set in 2007 to a 2PB data set today. The infrastructure at that time was so inadequate that some daily data processing jobs were taking more than a day to process and the situation was just getting worse with every passing day!

FACEBOOK SERVERS

Page 15:

We had an urgent need for infrastructure that could scale along with our data and it was at that time we then started exploring Hadoop as a way to address our scaling needs. [The] Hive/Hadoop cluster at Facebook stores more than 2PB of uncompressed data and routinely loads 15 TB of data daily.
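The figures in the quote can be put side by side with a little arithmetic. The sketch below assumes decimal units (1 PB = 1000 TB), which the quote does not specify, and treats "more than 2PB" as exactly 2 PB, so the results are lower bounds.

```python
TB_PER_PB = 1000  # assumption: decimal units; the quote does not say

data_2007_tb = 15              # size of the data set in 2007
data_now_tb = 2 * TB_PER_PB    # "more than 2PB", taken as exactly 2 PB
daily_load_tb = 15             # data loaded into the cluster per day

# How many times larger the data set is now than in 2007.
growth_factor = data_now_tb / data_2007_tb

# How many days of loading at the current rate add up to 2 PB.
days_to_load_current_size = data_now_tb / daily_load_tb

print(f"Growth since 2007: about {growth_factor:.0f}x")
print(f"Days of loading to reach 2 PB: about {days_to_load_current_size:.0f}")
```

Under these assumptions, a single day's load today is as large as the entire 2007 data set.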

Page 17:

Thanks a lot for your attention