Hadoop Optimization for Massive Image Processing: Case Study Face Detection

Abstract: Face detection applications are widely used for searching, tagging, and classifying people in very large image databases. Such applications must process a large number of relatively small images. The Hadoop Distributed File System (HDFS), however, was originally designed for storing and processing large files. Huge number of …

Distributed Image Matching with the Longest Common Subsequence Algorithm (En Uzun Ortak Küme Algoritmasıyla Dağıtık Görüntü Eşleme)

Alongside large-scale parallel and distributed computing hardware, many image processing algorithms can also be developed on the ordinary computers used in everyday life. In this sense, MapReduce is a parallel computing model proposed by Google. Image matching, also called image registration, is one of the most commonly encountered image processing tasks. Image registration is the arrangement of images onto a single common plane …

Vectorization of Large Amounts of Raster Satellite Images in a Distributed Architecture Using HIPI

Abstract: Vectorization processes focus on grouping the pixels of a raster image into raw line segments and forming lines, polylines, or polygons. To vectorize massive raster images despite resource and performance constraints, we use the distributed HIPI image processing interface, which is based on the MapReduce approach. Apache Hadoop is at the core of the framework. To realize such …

A Study for Adaptation of Image Stitching to Big Data Frameworks

In this study, we adapt the image stitching process to big data frameworks. To do so, we present an algorithm that merges two large images in accordance with Hadoop's map/reduce computation paradigm. Images are first converted to bitmaps, which are represented as matrices of 0s and 1s. The algorithm then finds the best possible match among two …
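As a rough illustration of the idea in this excerpt, the sketch below stitches two bitmaps side by side in a map/reduce style. The excerpt is truncated before it states the matching criterion, so everything specific here is an assumption: the images are taken to overlap horizontally, the score is the fraction of equal bits in the overlapping columns, and the names `to_bitmap`, `overlap_score`, `map_candidates`, `reduce_best`, and `stitch` are hypothetical. It runs in plain Python rather than on Hadoop.

```python
from functools import reduce


def to_bitmap(gray, threshold=128):
    """Convert a grayscale matrix to a matrix of 0s and 1s (threshold is an assumption)."""
    return [[1 if px >= threshold else 0 for px in row] for row in gray]


def overlap_score(left, right, k):
    """Fraction of equal bits when the last k columns of `left`
    are laid over the first k columns of `right`."""
    rows = len(left)
    matches = sum(
        1
        for r in range(rows)
        for c in range(k)
        if left[r][len(left[r]) - k + c] == right[r][c]
    )
    return matches / (rows * k)


def map_candidates(left, right, min_overlap=1):
    """Map step: emit one (overlap_width, score) pair per candidate overlap."""
    max_k = min(len(left[0]), len(right[0]))
    for k in range(min_overlap, max_k + 1):
        yield (k, overlap_score(left, right, k))


def reduce_best(a, b):
    """Reduce step: keep the higher score; on a tie, keep the wider overlap."""
    return max(a, b, key=lambda kv: (kv[1], kv[0]))


def stitch(left, right, min_overlap=1):
    """Merge two bitmaps of equal height at the best-scoring overlap."""
    best_k, _ = reduce(reduce_best, map_candidates(left, right, min_overlap))
    return [lrow + rrow[best_k:] for lrow, rrow in zip(left, right)]


left = [[0, 1, 1, 0], [1, 0, 0, 1]]
right = [[1, 0, 1], [0, 1, 0]]
stitched = stitch(left, right)
```

Each candidate overlap is scored independently, which is what makes the search easy to spread across mappers; the reduce step only compares scores.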

A scalable distributed query framework for unstructured big clinical data: A case study on diabetic records

Abstract: Unstructured data makes up close to 80% of the information in the healthcare industry and is growing exponentially. Analyzing and querying this type of data is inefficient with traditional relational database technologies. In this paper, we propose a distributed and scalable big data framework for querying and analyzing unstructured clinical data. The framework is …

MapReduce Based Scalable Range Query Architecture for Big Spatial Data

Abstract: Finding all objects that overlap a given range query is very important for extracting useful information from big spatial data. In this study, to enable range queries on large amounts of spatial data, three datasets of different sizes are created and a MapReduce computation model is set up …
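A range query of this kind can be sketched as a map step that filters objects and a reduce step that gathers the hits. The sketch below is a minimal single-process illustration, not the architecture from the paper: it assumes each spatial object is an axis-aligned bounding box `(xmin, ymin, xmax, ymax)`, treats touching boxes as overlapping, and uses the hypothetical names `overlaps`, `map_phase`, `reduce_phase`, and `range_query`.

```python
from collections import defaultdict


def overlaps(box, query):
    """True if two axis-aligned boxes intersect (shared edges count as overlap)."""
    bxmin, bymin, bxmax, bymax = box
    qxmin, qymin, qxmax, qymax = query
    return bxmin <= qxmax and qxmin <= bxmax and bymin <= qymax and qymin <= bymax


def map_phase(split, query):
    """Map step: emit ("hits", object_id) for every object in this split
    whose bounding box overlaps the query range."""
    for obj_id, box in split:
        if overlaps(box, query):
            yield ("hits", obj_id)


def reduce_phase(pairs):
    """Reduce step: group emitted values by key and sort them."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return {key: sorted(values) for key, values in grouped.items()}


def range_query(splits, query):
    """Run the map step over every split, then reduce the emitted pairs."""
    emitted = [pair for split in splits for pair in map_phase(split, query)]
    return reduce_phase(emitted).get("hits", [])


# Two input splits, standing in for blocks of a larger dataset.
splits = [
    [("a", (0, 0, 2, 2)), ("b", (5, 5, 6, 6))],
    [("c", (1, 1, 4, 4)), ("d", (10, 10, 12, 12))],
]
result = range_query(splits, (1.5, 1.5, 5, 5))
```

Because each split is filtered independently, the map step needs no coordination between workers; only the matching identifiers reach the reduce step.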