Arseam

International Journal of Advances in Engineering & Scientific Research

AN EFFICIENT DATA ANALYSIS FRAMEWORK IN BIG DATA

*Lilly Raffy Cheerotha, **V. N. Anushya

*Assistant Professor, Department of CSE, I.E.S. College of Engineering, Thrissur; **Assistant Professor, Department of CSE, M.A.M. College of Engineering & Technology, Trichy

DOI : Page No : 30-42

Published Online : 2015-12-30

Abstract:

Big data refers to very large volumes of data and to the techniques for handling such datasets. The concept has emerged because data is now growing faster than ever. Big data platforms carry out data storage, analysis, processing, and management in parallel; they can process several petabytes (1 petabyte = 10^15 bytes) of data in seconds and handle structured and unstructured data at the same time. The aim of this project is to apply a classification technique before tasks are mapped onto resources. Tasks are mapped with the MapReduce programming model, which reduces the workload on the resources. On its own, however, MapReduce takes considerable time to decide which resource each task should be allocated to. Parallel database technology is used to increase the performance of Big data because it allocates tasks to the resources in parallel.
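As a rough illustration of the MapReduce programming model mentioned here, the following is a minimal single-process sketch in Python. It is not the framework used in the paper and does not use Hadoop; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the word-count job are assumptions chosen only to show the map, shuffle, and reduce steps.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every record and collect the emitted (key, value) pairs."""
    pairs = []
    for record in records:
        pairs.extend(mapper(record))
    return pairs


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Apply the reducer to each key and its list of values."""
    return {key: reducer(key, values) for key, values in groups.items()}


def word_mapper(line):
    """Hypothetical mapper: emit (word, 1) for every word in a line."""
    return [(word.lower(), 1) for word in line.split()]


def count_reducer(key, values):
    """Hypothetical reducer: sum the counts emitted for one word."""
    return sum(values)


def word_count(lines):
    """Run the three phases in sequence on an in-memory list of lines."""
    return reduce_phase(shuffle(map_phase(lines, word_mapper)), count_reducer)


if __name__ == "__main__":
    print(word_count(["big data", "Big deal"]))
```

In a real cluster the map and reduce calls would run on different nodes; here they run sequentially so that only the data flow is visible.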

In this model, an Ensemble Classifier is used to classify the tasks. An Ensemble Classifier is a group of different classifiers that run in parallel and share the knowledge of the fastest classifier with the others. Here it is built from a Support Vector Machine, a Decision Tree, and a K-Nearest Neighbor classifier. As a result, the data is processed with minimal scheduling time, because the map class no longer spends time deciding which resource a task should be allocated to. Combining the Ensemble Classifier with the MapReduce model and parallel database technology increases the efficiency and throughput of Big data processing by reducing the scheduling time.
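The idea of combining several classifiers can be sketched with simple majority voting. The example below is an assumption-laden illustration, not the authors' implementation: the task features (data size in GB, CPU hours), the labels "small" and "large", the training points, and all thresholds and weights are invented. A hand-set linear rule stands in for the Support Vector Machine and a one-level threshold rule stands in for the Decision Tree; only the K-Nearest Neighbor member is implemented in full, and the knowledge sharing between classifiers described in the abstract is not modelled.

```python
import math
from collections import Counter

# Hypothetical training tasks: ((data size in GB, CPU hours), resource class).
TRAIN = [
    ((1.0, 0.5), "small"), ((2.0, 1.0), "small"), ((1.5, 0.8), "small"),
    ((8.0, 6.0), "large"), ((9.0, 7.5), "large"), ((7.5, 5.0), "large"),
]


def knn_classifier(x, k=3):
    """K-Nearest Neighbor: majority label among the k closest training tasks."""
    nearest = sorted(TRAIN, key=lambda item: math.dist(item[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def stump_classifier(x):
    """One-level decision rule standing in for a Decision Tree (threshold is assumed)."""
    return "large" if x[0] > 5.0 else "small"


def linear_classifier(x):
    """Hand-set linear rule standing in for a trained SVM (weights are assumed)."""
    return "large" if 0.5 * x[0] + 0.5 * x[1] - 4.0 > 0 else "small"


def classify_task(x):
    """Majority vote over the three member classifiers."""
    votes = [clf(x) for clf in (knn_classifier, stump_classifier, linear_classifier)]
    return Counter(votes).most_common(1)[0][0]


if __name__ == "__main__":
    for task in [(8.5, 6.5), (1.2, 0.6), (6.0, 0.5)]:
        print(task, classify_task(task))
```

With three voters and two labels a tie cannot occur, so the vote always yields a single resource class that could then be passed to the map step.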

Keywords: MapReduce, Hadoop, Ensemble Classifier, Parallel Database
