Abstract—Cloud platforms allow programmers to write applications that run in the cloud, use services from the cloud, or both, while abstracting away the details of scalability and distributed processing. With the emergence of clouds as a...
We present a virtualized setup of a Hadoop cluster that provides greater computing capacity with fewer resources, since a virtualized cluster requires fewer physical machines. The master node of the cluster is set up on a physical...
Hadoop is the main framework used to process and manage large amounts of data. Anyone who works in programming or data science should become familiar with the platform.
Increasing performance and decreasing cost of microprocessors are making it feasible to move more processing power to the data source. This allows us to investigate new methods of storage delivery and storage management that were not...
In recent years, a number of efficient and scalable frequent itemset mining algorithms for big data analytics have been proposed by many researchers. Initially, MapReduce-based frequent itemset mining algorithms on Hadoop clusters...
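The core counting step of such frequent itemset mining algorithms can be sketched on a single machine; the transaction data, item names, and support threshold below are illustrative assumptions, not taken from any of the papers listed here:

```python
from collections import Counter
from itertools import combinations

# toy transaction database (hypothetical example data)
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def frequent_itemsets(transactions, min_support, k):
    # count every k-item combination occurring in each transaction
    counts = Counter()
    for t in transactions:
        for itemset in combinations(sorted(t), k):
            counts[itemset] += 1
    # keep only itemsets that meet the minimum support threshold
    return {s: c for s, c in counts.items() if c >= min_support}

pairs = frequent_itemsets(transactions, min_support=2, k=2)
# ("bread", "milk") and ("bread", "butter") each occur in 2 transactions
```

In a MapReduce setting, the counting loop becomes the map phase (emit each candidate itemset with count 1) and the support filtering becomes the reduce phase.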
Advancement in information and communication technology (ICT) has given rise to an explosion of data in every field of operations. Working with this enormous volume of data (or Big Data, as it is popularly known) for the extraction of useful...
In recent years, data and Internet usage have grown rapidly, giving rise to big-data problems. To address these problems, many software frameworks are used to increase the performance of distributed systems. This...
Search engine users search for results with different goals in mind. Identifying these goals can be very useful for improving search engine relevance and user experience. In the existing system there are many data mining algorithms...
I discussed the Big Data concept and the general tools used for it. Since I cover NoSQL separately, I did not touch on that topic here.
This work presents a distributed system built to store and index the data resulting from a forensic analysis of a computer storage device. Over the years, the amount of data...
Map-Reduce is a programming model and an associated implementation for processing and generating large data sets. This model has a single point of failure: the master, which coordinates the work in a cluster. In contrast, wireless...
This paper presents the design, implementation, and operation of a novel distributed fault-tolerant middleware. It uses interconnected WSNs that implement the Map-Reduce paradigm, consisting of several low-cost and low-power...
Weather is critical to many aspects of human life. Studying how temperature evolves over time in a given location or country can be beneficial for several purposes. Processing, collecting...
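As a minimal illustration of this kind of temperature processing, the map/reduce-style sketch below computes the maximum reading per station on a single machine; the station codes and readings are hypothetical:

```python
from collections import defaultdict

# hypothetical (station, temperature in Celsius) readings
readings = [
    ("LHR", 18.2), ("LHR", 21.5),
    ("DXB", 39.0), ("DXB", 41.3),
]

def max_temperature(readings):
    # "map": key each reading by station; "reduce": keep the max per key
    maxima = defaultdict(lambda: float("-inf"))
    for station, temp in readings:
        maxima[station] = max(maxima[station], temp)
    return dict(maxima)

hottest = max_temperature(readings)
# hottest == {"LHR": 21.5, "DXB": 41.3}
```

On a Hadoop cluster the same logic runs as a mapper emitting (station, temperature) pairs and a reducer taking the maximum over each station's values.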
The Internet has become the most popular surfing environment, which increases the size of service-oriented data. As the data size grows, finding and retrieving the most similar data from a large volume of data becomes a more difficult task...
Data mining is the process of analyzing data from different perspectives and summarizing it into useful information; the patterns, associations, or relationships among the data can provide that information. Spatial Data Mining (SDM) is the...
Data is being produced by firms at ever-increasing rates, and firms are finding new ways to use data to create business value. The generated volumes of data create the need for better and cheaper storage options that allow...
The major challenge of Big Data is extracting useful information from terabytes of raw data and analyzing the extracted information, which is essential for decision making. These factors can be found in the proposed system...
Existing studies in data mining mostly focus on outlier detection using a single clustering algorithm. There are many clustering methods available in data mining. Values or objects that are similar to each other are organized...
Nowadays, data is generated and transferred faster than ever. Systems must maintain huge datasets and are increasingly interconnected. Big data refers to collections of large data sets, with billions upon billions of records stored and accessed...
HDFS (Hadoop Distributed File System) and the Hadoop components are often preferred for large data sets. Nowadays, we see that big data is increasingly relevant in software engineering, food engineering, hardware engineering,...
MapReduce is a practical model for processing large-scale data, that is, huge volumes of data at very high speed. It is a parallel programming model that helps achieve near-real-time results. Designed efficiently by...
Cloud computing has emerged as a model that harnesses the massive capacity of data centers to host services in a cost-effective manner. MapReduce, proposed by Google in 2004, has been widely used as a Big Data processing platform and has...
Data-intensive systems encompass terabytes to petabytes of data. Such systems require massive storage and intensive computational power in order to execute complex queries and generate timely results. Further,...
Opinion Mining (OM) is a field of Natural Language Processing (NLP) that aims to capture human sentiment in the given text. With the ever-spreading of online purchasing websites, micro-blogging sites, and social media platforms, OM in...
The term 'Big Data' refers to data sets whose size, complexity, and growth rate make them difficult to capture, manage, process, or analyse. Big data is a grouping of such gigantic and complex information that it becomes...
Big data is a collection of structured and unstructured data sets that includes huge quantities of data, social media analytics, data management capabilities, and real-time data. For Big Data processing, Hadoop uses the MapReduce paradigm...
As we all know, the MapReduce paradigm has become synonymous with big data computing problems such as processing, generating, and/or deducing large-scale data sets. Hadoop is a well-known framework for these types of problems. The problems for...
Big data is a collection of structured, semi-structured, and unstructured data sets that contain large amounts of data, social media analytics, information management capabilities, and real-time information. For big data processing, Hadoop...
MapReduce is a parallel programming model and an associated implementation introduced by Google. In the programming model, a user specifies the computation by two functions, Map and Reduce. The underlying MapReduce library automatically...
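The two-function model described above can be sketched in a few lines of single-process Python; this toy word count is illustrative only and elides the distributed execution, fault tolerance, and I/O that the real MapReduce library provides:

```python
from collections import defaultdict
from itertools import chain

def map_fn(line):
    # Map: emit an intermediate (word, 1) pair for every word in the line
    return [(word, 1) for word in line.split()]

def reduce_fn(word, counts):
    # Reduce: combine all intermediate values emitted for one key
    return word, sum(counts)

def mapreduce(lines):
    # shuffle step: group intermediate pairs by key before reducing
    groups = defaultdict(list)
    for key, value in chain.from_iterable(map_fn(l) for l in lines):
        groups[key].append(value)
    return dict(reduce_fn(k, v) for k, v in groups.items())

counts = mapreduce(["big data", "big cluster"])
# counts == {"big": 2, "data": 1, "cluster": 1}
```

The shuffle (grouping by key) is exactly what the library performs between the two user-supplied functions in a real cluster.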
Hadoop is essentially a "framework of tools": a Java-based programming framework rather than a single piece of software. The main goal of Hadoop is to process large data sets by breaking them into smaller distributed computing tasks. It is part of...
Log analysis is a critical procedure in most framework and system activities, where log information is used for various purposes, for example performance monitoring, security auditing, or even reporting and profiling...
Big Data is a new field of research in recent years; it is not only big in volume, but is also produced with velocity and variety, which drives a research upsurge in multidisciplinary fields like decision making,...
In an era of scientific progress, challenges in the field of Information Retrieval (IR) are widespread and tedious to tackle, due to the increased usage of massive volumes of data. Hence, scalability and efficiency are the two main...
We are in the age of big data, which involves the collection of large datasets. Managing and processing large data sets is difficult with existing traditional database systems. Hadoop and MapReduce have become one of the most powerful and...
Python, Pandas, data generators, MapReduce, and other tools applied to the insurance industry in Mexico. Shows examples of reserve and net level premium calculations, run-off triangles with Python and NumPy, and compliance report generation...
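As one sketch of the run-off triangle computations mentioned, the snippet below derives chain-ladder development factors with NumPy; the triangle figures are invented for illustration and are not from the work itself:

```python
import numpy as np

# cumulative claims run-off triangle (rows: accident years,
# columns: development years); NaN marks unobserved future cells.
# All figures are hypothetical.
triangle = np.array([
    [100.0, 150.0, 165.0],
    [110.0, 160.0, np.nan],
    [120.0, np.nan, np.nan],
])

def development_factors(tri):
    # chain-ladder factor for each development step: ratio of column
    # sums over the accident years where both columns are observed
    factors = []
    for j in range(tri.shape[1] - 1):
        both = ~np.isnan(tri[:, j]) & ~np.isnan(tri[:, j + 1])
        factors.append(tri[both, j + 1].sum() / tri[both, j].sum())
    return factors

f = development_factors(triangle)
# f[0] == (150 + 160) / (100 + 110); f[1] == 165 / 150
```

These factors are then used to project each accident year's claims to ultimate, which is the basis of the reserve calculations the abstract refers to.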
Hadoop has become the de facto platform for large-scale data analysis in commercial applications, and increasingly so in scientific applications. However, Hadoop’s byte stream data model causes inefficiencies when used to process...
An ambient assisted living (AAL) system consists of heterogeneous sensors and devices which generate huge amounts of patient-specific unstructured raw data every day. An important feature of remote monitoring applications is to identify...
Abstract—Processing image data, which occupies an important place within big data, in the time specified in the service-level agreement has become an important issue. In this era of ever-increasing multimedia production and use, the fast...