Abstract—Cloud platforms allow programmers to write applications that run in the cloud, use services from the cloud, or both, while abstracting away the details of scalability and distributed processing. With the emergence of clouds as a...
We present a virtualized setup of a Hadoop cluster that provides greater computing capacity with fewer resources, since a virtualized cluster requires fewer physical machines. The master node of the cluster is set up on a physical...
Hadoop is the main framework used to process and manage large amounts of data. Anyone who works in programming or data science should become familiar with the platform.
Increasing performance and decreasing cost of microprocessors are making it feasible to move more processing power to the data source. This allows us to investigate new methods of storage delivery and storage management that were not...
In recent years, a number of efficient and scalable frequent itemset mining algorithms for big data analytics have been proposed by many researchers. Initially, MapReduce-based frequent itemset mining algorithms on Hadoop clusters...
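The core counting step of such frequent itemset mining algorithms can be sketched on a single machine; the transaction data, item names, and support threshold below are illustrative assumptions, not taken from any of the papers listed here:

```python
from collections import Counter
from itertools import combinations

# toy transaction database (hypothetical example data)
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def frequent_itemsets(transactions, min_support, k):
    # count every k-item combination occurring in each transaction
    counts = Counter()
    for t in transactions:
        for itemset in combinations(sorted(t), k):
            counts[itemset] += 1
    # keep only itemsets that meet the minimum support threshold
    return {s: c for s, c in counts.items() if c >= min_support}

pairs = frequent_itemsets(transactions, min_support=2, k=2)
# ("bread", "milk") and ("bread", "butter") each occur in 2 transactions
```

In a MapReduce setting, the counting loop becomes the map phase (emit each candidate itemset with count 1) and the support filtering becomes the reduce phase.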
Advancement in information and communication technology (ICT) has given rise to an explosion of data in every field of operations. Working with this enormous volume of data (or Big Data, as it is popularly known) for the extraction of useful...
In recent years, data and Internet usage have grown rapidly, giving rise to big-data problems. To address these problems, many software frameworks are used to increase the performance of distributed systems. This...
Search engine users search for results with different goals in mind. Identifying these goals can be very useful for improving search engine relevance and user experience. In the existing system there are many data mining algorithms...
I discussed the Big Data concept and the general tools used for it. Since I cover NoSQL separately, I did not touch on that topic here.
This work presents a distributed system built to store and index the data resulting from a forensic analysis of a computer storage device. Over the years, the amount of data...
Map-Reduce is a programming model and an associated implementation for processing and generating large data sets. This model has a single point of failure: the master, which coordinates the work in a cluster. In contrast, wireless...
This paper presents the design, implementation, and operation of a novel distributed fault-tolerant middleware. It uses interconnected WSNs that implement the Map-Reduce paradigm, consisting of several low-cost and low-power...
Weather is critical to many aspects of human life. Studying how temperature evolves over time in a given location or country can be beneficial for several purposes. Processing, collecting...
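As a minimal illustration of this kind of temperature processing, the map/reduce-style sketch below computes the maximum reading per station on a single machine; the station codes and readings are hypothetical:

```python
from collections import defaultdict

# hypothetical (station, temperature in Celsius) readings
readings = [
    ("LHR", 18.2), ("LHR", 21.5),
    ("DXB", 39.0), ("DXB", 41.3),
]

def max_temperature(readings):
    # "map": key each reading by station; "reduce": keep the max per key
    maxima = defaultdict(lambda: float("-inf"))
    for station, temp in readings:
        maxima[station] = max(maxima[station], temp)
    return dict(maxima)

hottest = max_temperature(readings)
# hottest == {"LHR": 21.5, "DXB": 41.3}
```

On a Hadoop cluster the same logic runs as a mapper emitting (station, temperature) pairs and a reducer taking the maximum over each station's values.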
The Internet has become the most popular surfing environment, which increases the size of service-oriented data. As the data size grows, finding and retrieving the most similar data from a large volume of data becomes a more difficult task...
Data mining is the process of analyzing data from different perspectives and summarizing it into useful information; the patterns, associations, or relationships among the data can provide that information. Spatial Data Mining (SDM) is the...
Data is being produced by firms at ever-increasing rates, and firms are finding new ways to use data to create business value. The generated volumes of data create the need for better and cheaper storage options that allow...
The major challenge of Big Data is extracting useful information from terabytes of raw data and analyzing the extracted information, which is essential for decision making. These factors can be found in the proposed system...
Existing studies in data mining mostly focus on outlier detection using a single clustering algorithm. There are many clustering methods available in data mining. Values or objects that are similar to each other are organized...
Nowadays, data is generated and transferred faster than ever. Systems must maintain huge datasets and are increasingly interconnected. Big data refers to collections of large data sets, with billions upon billions of records stored and accessed...
HDFS (Hadoop Distributed File System) and the Hadoop components are often preferred for large data sets. Nowadays, we see that big data is increasingly relevant in software engineering, food engineering, hardware engineering,...
MapReduce is a practical model for processing large-scale data, that is, huge volumes of data at very high speed. It is a parallel programming model that helps achieve near-real-time results. Designed efficiently by...
Cloud computing has emerged as a model that harnesses the massive capacity of data centers to host services in a cost-effective manner. MapReduce, proposed by Google in 2004, has been widely used as a Big Data processing platform and has...
Data-intensive systems encompass terabytes to petabytes of data. Such systems require massive storage and intensive computational power in order to execute complex queries and generate timely results. Further,...
Opinion Mining (OM) is a field of Natural Language Processing (NLP) that aims to capture human sentiment in the given text. With the ever-spreading of online purchasing websites, micro-blogging sites, and social media platforms, OM in...
The term 'Big Data' refers to data sets whose size, complexity, and growth rate make them difficult to capture, manage, process, or analyse. Big data is a grouping of such gigantic and complex information that it becomes...
Big data is a collection of structured and unstructured data sets that includes huge quantities of data, social media analytics, data management capabilities, and real-time data. For Big Data processing, Hadoop uses the MapReduce paradigm...
As we all know, the MapReduce paradigm has become synonymous with big data computing problems such as processing, generating, and/or deducing large-scale data sets. Hadoop is a well-known framework for these types of problems. The problems for...
Big data is a collection of structured, semi-structured, and unstructured data sets that contain large amounts of data, social media analytics, information management capabilities, and real-time information. For big data processing, Hadoop...
MapReduce is a parallel programming model and an associated implementation introduced by Google. In the programming model, a user specifies the computation by two functions, Map and Reduce. The underlying MapReduce library automatically...
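The two-function model described above can be sketched in a few lines of single-process Python; this toy word count is illustrative only and elides the distributed execution, fault tolerance, and I/O that the real MapReduce library provides:

```python
from collections import defaultdict
from itertools import chain

def map_fn(line):
    # Map: emit an intermediate (word, 1) pair for every word in the line
    return [(word, 1) for word in line.split()]

def reduce_fn(word, counts):
    # Reduce: combine all intermediate values emitted for one key
    return word, sum(counts)

def mapreduce(lines):
    # shuffle step: group intermediate pairs by key before reducing
    groups = defaultdict(list)
    for key, value in chain.from_iterable(map_fn(l) for l in lines):
        groups[key].append(value)
    return dict(reduce_fn(k, v) for k, v in groups.items())

counts = mapreduce(["big data", "big cluster"])
# counts == {"big": 2, "data": 1, "cluster": 1}
```

The shuffle (grouping by key) is exactly what the library performs between the two user-supplied functions in a real cluster.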
Hadoop is essentially a "framework of tools": a Java-based programming framework rather than a single piece of software. The main goal of Hadoop is to process large data sets by breaking them into smaller distributed computing tasks. It is part of...
Log analysis is a critical procedure in most framework and system activities, where log information is used for various purposes, for example performance monitoring, security auditing, or even reporting and profiling...
Big Data is a new field of research in recent years; it is not only big in volume, but is also produced with velocity and variety, which drives a research upsurge in multidisciplinary fields like decision making,...
In an era of scientific progress, challenges in the field of Information Retrieval (IR) are widespread and tedious to tackle, due to the increased usage of massive volumes of data. Hence, scalability and efficiency are the two main...
We are in the age of big data, which involves the collection of large datasets. Managing and processing large data sets is difficult with existing traditional database systems. Hadoop and MapReduce have become one of the most powerful and...
Python, Pandas, data generators, MapReduce, and other tools applied to the insurance industry in Mexico. Shows examples of reserve and net level premium calculations, run-off triangles with Python and NumPy, and compliance report generation...
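As one sketch of the run-off triangle computations mentioned, the snippet below derives chain-ladder development factors with NumPy; the triangle figures are invented for illustration and are not from the work itself:

```python
import numpy as np

# cumulative claims run-off triangle (rows: accident years,
# columns: development years); NaN marks unobserved future cells.
# All figures are hypothetical.
triangle = np.array([
    [100.0, 150.0, 165.0],
    [110.0, 160.0, np.nan],
    [120.0, np.nan, np.nan],
])

def development_factors(tri):
    # chain-ladder factor for each development step: ratio of column
    # sums over the accident years where both columns are observed
    factors = []
    for j in range(tri.shape[1] - 1):
        both = ~np.isnan(tri[:, j]) & ~np.isnan(tri[:, j + 1])
        factors.append(tri[both, j + 1].sum() / tri[both, j].sum())
    return factors

f = development_factors(triangle)
# f[0] == (150 + 160) / (100 + 110); f[1] == 165 / 150
```

These factors are then used to project each accident year's claims to ultimate, which is the basis of the reserve calculations the abstract refers to.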
Hadoop has become the de facto platform for large-scale data analysis in commercial applications, and increasingly so in scientific applications. However, Hadoop’s byte stream data model causes inefficiencies when used to process...
An ambient assisted living (AAL) system consists of heterogeneous sensors and devices which generate huge amounts of patient-specific unstructured raw data every day. An important feature of remote monitoring applications is to identify...
Abstract—Processing image data, which occupies an important place within big data, in the time specified in the service-level agreement has become an important issue. In this era of ever-increasing multimedia production and use, the fast...