Idle Time Estimation for Bandwidth-Efficient Synchronization in Replicated Distributed File System

Abstract: Synchronization is a promising approach to solving consistency problems in replicated distributed file systems. Synchronization can be repeated periodically, with either a fixed time interval or an interval that is adjusted adaptively. In this paper, we propose a policy-based, performance-efficient distributed file synchronization approach, in which synchronization processes occur in varying …

Hadoop Optimization for Massive Image Processing: Case Study Face Detection

Abstract: Face detection applications are widely used for searching, tagging, and classifying people in very large image databases. This type of application requires processing a large number of relatively small images. The Hadoop Distributed File System (HDFS), on the other hand, was originally designed for storing and processing large files. A huge number of …

Hadoop Plugin For Distributed and Parallel Image Processing

Abstract: The Hadoop Distributed File System (HDFS) is widely used for large-scale data storage and processing. Hadoop uses the MapReduce programming model for parallel processing. The work presented in this paper proposes a novel Hadoop plugin to process image files with the MapReduce model. The plugin introduces image-related I/O formats and novel classes for creating records from input …