24/08/2017
Advanced Java Programming Course
By Võ Văn Hải
Faculty of Information Technologies
Industrial University of Ho Chi Minh City
Big Data - MongoDB
Session objectives
Big Data Overview
NoSQL introduction
MongoDB introduction
MongoDB – Java Programming
Big Data, the market value
Data Management Systems: History
• Over the past decades, RDBMSs have been successful at solving
problems related to storing, serving and processing data.
• RDBMSs are adopted for:
o Online transaction processing (OLTP),
o Online analytical processing (OLAP).
• Vendors such as Oracle, Vertica, Teradata, Microsoft and IBM
built their solutions on relational algebra and SQL.
But…
Something Changed!
• Traditionally, systems recorded transactions (OLTP) and ran
analytics (OLAP) on the recorded data.
• Not much was done to understand:
o the reasons behind transactions,
o which factors contributed to the business, and
o which factors could drive customer behavior.
• Pursuing such initiatives requires working with a large amount of
varied data.
Something Changed!
• This approach was pioneered by Google, Amazon, Yahoo, Facebook
and LinkedIn.
• They work with different types of data, often semi-structured or
unstructured.
• And they have to store, serve and process huge amounts of data.
Something Changed!
• RDBMSs can cope with these requirements to some extent, but they
have issues related to:
o expensive licensing,
o complex application logic,
o evolving data models.
• There was a need for systems that:
o work with different kinds of data formats,
o do not require a strict schema,
o and are easily scalable.
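To make the "no strict schema" requirement concrete, here is a minimal sketch in plain Java. It is not MongoDB code and uses no database driver. The class name SchemaFlexDemo, its methods and the sample records are hypothetical, chosen only for illustration. Each record is a map, so records in the same collection may carry different fields, including nested lists and nested records.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class SchemaFlexDemo {

    // Builds three hypothetical "customer" records whose fields differ,
    // the way records in a schema-less store are allowed to differ.
    public static List<Map<String, Object>> sampleDocuments() {
        List<Map<String, Object>> docs = new ArrayList<>();

        Map<String, Object> first = new LinkedHashMap<>();
        first.put("name", "An");
        first.put("email", "an@example.com");
        docs.add(first);

        Map<String, Object> second = new LinkedHashMap<>();
        second.put("name", "Binh");
        // A field holding a list of values, absent from the first record.
        second.put("phones", Arrays.asList("090-000-0001", "090-000-0002"));
        docs.add(second);

        Map<String, Object> third = new LinkedHashMap<>();
        third.put("name", "Chi");
        // A field holding a nested record.
        third.put("address", Collections.singletonMap("city", "Ho Chi Minh City"));
        third.put("age", 30);
        docs.add(third);

        return docs;
    }

    // Collects, in sorted order, every field name used by at least one record.
    public static Set<String> allFields(List<Map<String, Object>> docs) {
        Set<String> fields = new TreeSet<>();
        for (Map<String, Object> doc : docs) {
            fields.addAll(doc.keySet());
        }
        return fields;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> docs = sampleDocuments();
        // Prints [address, age, email, name, phones]
        System.out.println(allFields(docs));
    }
}
```

In a relational table, adding "phones" or "address" would mean altering the table definition or adding further tables. Here a new field simply appears in the records that need it.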
Evolutions in Data Management
• As part of the innovation in data management systems, several new
technologies were built:
o 2003 - Google File System,
o 2004 - MapReduce,
o 2006 - BigTable,
o 2007 - Amazon Dynamo,
o 2012 - Google Compute Engine.
• Each solved different use cases and had a different set of
assumptions.
• All these mark the beginning of a different way of thinking
about data management.
Hello, Big Data!
Go to hell RDBMS!
Definition
“Big data is a term for data sets that are so large or complex that
traditional data processing application software is inadequate to
deal with them. Big data challenges include capturing data, data
storage, data analysis, search, sharing, transfer, visualization,
querying, updating and information privacy.”
(https://en.wikipedia.org/wiki/Big_data)
Characteristics
• Volume
o The quantity of generated and stored data. The size of the data determines its
value and potential insight, and whether it can be considered big data at all.
• Variety
o The type and nature of the data. This helps people who analyze it to effectively use
the resulting insight.
• Velocity
o In this context, the speed at which the data is generated and processed to meet
the demands and challenges that lie in the path of growth and development.
• Variability
o Inconsistency in the data set can hamper the processes that handle and manage it.
• Veracity
o The quality of captured data can vary greatly, which affects the accuracy of the analysis.