Analysis of economic data using big data
Post on 14-Apr-2017
Presentation on
Analysis of Economic Data Using Big Data
Presented By: SHIVUMANJESH P
[4JC13MCA51] VI SEM MCA
SJCE
Internal Guide: C J HARSHITHA, Assistant Professor
Dept. of MCA, SJCE
External Guide: Imran basha, Senior Consultant
Snipe IT Solutions
JSS MAHAVIDYAPEETHA
SRI JAYACHAMARAJENDRA COLLEGE OF ENGINEERING, MYSURU-570006
AN AUTONOMOUS INSTITUTE AFFILIATED TO VISVESVARAYA TECHNOLOGICAL UNIVERSITY, BELAGAVI.
Problem Definition
1. Rising inflation is a serious threat to a country's development.
2. Unscientific farming.
3. At the big-picture level, economic indicators and decision makers rely on local economic transactions and on data records.
Objective
• To examine economic data and record the year-to-year increases and decreases in the prices of vegetables and food items.
• To favor fresh, edible food products and help overcome problems of deficiency and malnutrition.
• To maintain continuous connectivity across the demand-supply chain.
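The first objective, recording year-to-year price increases and decreases, could be sketched roughly as follows. This is a minimal Python illustration only: the function name `year_over_year_changes`, the record layout (year mapped to price), and the sample prices are assumptions made for the example and are not taken from the project's data sets.

```python
from typing import Dict, List, Tuple


def year_over_year_changes(prices: Dict[int, float]) -> List[Tuple[int, float, str]]:
    """Return (year, percent change vs. the previous recorded year, direction)."""
    changes = []
    years = sorted(prices)
    # Pair each recorded year with the one before it.
    for prev, curr in zip(years, years[1:]):
        pct = (prices[curr] - prices[prev]) / prices[prev] * 100.0
        if pct > 0:
            direction = "increase"
        elif pct < 0:
            direction = "decrease"
        else:
            direction = "unchanged"
        changes.append((curr, round(pct, 2), direction))
    return changes


# Hypothetical prices for one product (illustrative values only).
sample_prices = {2010: 20.0, 2011: 25.0, 2012: 22.0}
for year, pct, direction in year_over_year_changes(sample_prices):
    print(year, pct, direction)
```

The sketch assumes every recorded price is non-zero; a real data set would also need a rule for missing years.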
Scope of the Project
• Economic data analysis has a significant impact on e-commerce and builds potential for business activities and investments.
• The analysis is limited to particular products and can be extended in future based on requirements and developments.
• The big data analysis can be presented through an Android application with simple, smart user interfaces about the products people use in daily life.
• It requires a high-end system specification, since it deals with large data sets with diversified features and functionalities.
User characteristics
• The system will provide a very precise and simple platform to the respective users.
• The admin grants access to the developer and to the user, and provides the data sets.
• The developer collects the data sets, clusters the data based on particular criteria, and analyzes the behavior of the data elements.
• The user gets the desired result by firing a query.
General constraints
• Big data tools are efficient for large data sets and are not suitable for low-volume data.
• Since the main objective is data analysis, the user interface is given the lowest priority.
• Dealing with completely unstructured data items can be tedious.
• Data obtained from various sources may not share the same parameters.
Functional Requirements
Storage
• The Hadoop Distributed File System (HDFS) is designed for storing very large files with streaming data access patterns, running on clusters of commodity hardware.
• The economic data is a highly diversified data set, large in volume and varied in nature.
• A dataset is typically generated or copied from a source, and then various analyses are performed on that dataset over time.
• Applications that require low-latency access to data, in the tens of milliseconds range, will not work well with HDFS.
Computation
• MapReduce is a processing technique that allows for massive scalability across hundreds or thousands of servers in a Hadoop cluster.
• The MapReduce algorithm contains two important tasks, namely Map and Reduce.
• In economic data analysis, this algorithm helps find the demand for particular goods based on certain keywords.
• The cost of the shuffle and sort phase depends mainly on the volume of the data sets.
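The Map, shuffle-and-sort, and Reduce steps described above can be illustrated with a small in-memory simulation. This is a sketch in plain Python, not Hadoop code and not the project's actual job: the keyword set, the function names (`map_phase`, `shuffle_and_sort`, `reduce_phase`, `run_job`), and the sample records are all hypothetical. It counts how often each keyword appears, as a stand-in for keyword-based demand.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical keywords naming the goods of interest.
KEYWORDS = {"tomato", "onion", "potato"}


def map_phase(record: str) -> Iterator[Tuple[str, int]]:
    """Map: emit (keyword, 1) for every keyword occurrence in one record."""
    for word in record.lower().split():
        if word in KEYWORDS:
            yield word, 1


def shuffle_and_sort(pairs: Iterable[Tuple[str, int]]) -> List[Tuple[str, List[int]]]:
    """Shuffle and sort: group the values by key, then order the keys."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return sorted(grouped.items())


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: sum the counts collected for one keyword."""
    return key, sum(values)


def run_job(records: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle-and-sort, and reduce in sequence on one machine."""
    mapped = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(k, v) for k, v in shuffle_and_sort(mapped))


# Illustrative records only; not taken from the project's data sets.
sample_records = [
    "order tomato onion",
    "order onion",
    "enquiry potato onion",
]
print(run_job(sample_records))
```

In a real Hadoop cluster the map and reduce functions run on many nodes and the framework performs the grouping between them; here a dictionary plays that role, which is why the grouping step grows with the number of emitted pairs.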
Performance Requirements
• The main reason for choosing big data for economic analysis is the velocity of data processing.
• Connecting commodity systems as nodes of a cluster helps in quick retrieval of data items.
• A distributed system environment offers considerable flexibility.
Hardware Requirements
• Processor: Core i3 onwards
• RAM: 4 GB+
• Hard disk space: 40 GB+
Software Requirements
• Technology: Hadoop
• Tools: Apache Hive, Apache Pig, Apache Sqoop, Apache Oozie, RStudio
• Operating System: Linux
System Architecture
Class Diagram
Algorithm Design
Dataflow Diagram
Level 1 DFD
Activity Diagram
Use case Diagram – Admin & User
Use case Diagram-Developer
Sequence Diagram
Requirements
Unstructured Datasets
Structured Datasets
System Implementation
R Environment
Experimental Results
Test Cases
| Test case no. | Test case description | Required input | Expected output | Actual output | Pass/Fail |
|---|---|---|---|---|---|
| TC 01 | Verification of the nodes | Command to start Hadoop nodes (Start-all.sh) | All nodes should start | All nodes are present | Pass |
| TC 02 | Verification of Hive installation | Hive version command | Should return the installed Hive version | Hive version is returned | Pass |
| TC 03 | Verification of Pig installation | Command to start Pig (/opt/pig) | Should return the grunt shell | Grunt shell is returned | Pass |
| TC 04 | Verification of Sqoop installation | Sqoop version command | Should return the installed Sqoop version | Sqoop version is returned | Pass |
| TC 05 | Verification of data imported to HDFS from RDBMS | Loading into the Hadoop file system from the local file system | Imported data should be present in HDFS | Imported data is present in HDFS | Pass |
| TC 06 | Validating user query | Entering query | Valid query should be entered | Valid query is entered | Pass |
| TC 07 | Testing the processed data | Post query | Processed data should be correct | Processed data is valid | Pass |
| TC 08 | Importing the processed data to R | Import Dataset | Processed data should be imported | Processed data is imported | Pass |
| TC 09 | Mapping of processed dataset | Barplot() | Processed dataset should be mapped correctly | Processed data is mapped correctly | Pass |
| TC 10 | Mapping in pie chart | Pie() | Processed data should be mapped in percent | Processed data is not mapped in percent | Fail |
| TC 11 | Retrieving the result in less than 5 seconds | Dump() | Result should be displayed within 5 seconds | Result takes more than 5 seconds to display | Fail |
| TC 12 | Plotting the values obtained in R | Plot() | All the values should be obtained | Some values are missing | Fail |
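TC 10 expects the pie chart to show each slice as a percentage. One possible way to prepare such labels, before handing them to a plotting function, is sketched below. This is an assumption-laden illustration in Python rather than the R code used in the project: the function name `percent_labels` and the sample values are hypothetical.

```python
from typing import Dict


def percent_labels(values: Dict[str, float]) -> Dict[str, str]:
    """Build 'name (xx.x%)' labels from raw slice values."""
    total = sum(values.values())
    if total <= 0:
        raise ValueError("total must be positive to compute percentages")
    return {
        name: f"{name} ({value / total * 100:.1f}%)"
        for name, value in values.items()
    }


# Illustrative values only; not results from the project.
print(percent_labels({"tomato": 50.0, "onion": 30.0, "potato": 20.0}))
```

The labels are computed from the share of the total, so they stay correct even when the raw values do not already sum to 100.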
Conclusion
• The statistical analysis was carried out for fruits and vegetables over the period 1970-2013.
• The major requirements are based on the context of the inflation problem.
• The analysis is done mainly on a product basis and a year basis.
• This analysis serves as a vital input for machine learning mechanisms.
Future Enhancements
• The analysis can be extended to food grains.
• An enterprise application can be built by embedding a search engine, which would be helpful for end users.
• The data sets can be tuned, which may lead to deriving other requirements of a different paradigm.
• The graphical representation can be improved by displaying accurate values rather than ranges.
Company Details
Company Name : Snipe IT Solutions
Address : # 123, 3rd floor, 70th Cross, 5th Block, Rajajinagar Nagar, Bengaluru.
External guide : Imran basha Senior Consultant
Snipe IT Solutions
Email : mkimranbasha@gmail.com
Ph no : 9590071811
Thank You