Face detection applications are widely used for searching, tagging and classifying
people in very large image databases. This type of application requires processing
a large number of relatively small images. The Hadoop Distributed File System (HDFS),
however, was originally designed for storing and processing large files. A huge
number of small images slows HDFS down by increasing the total initialization time
of jobs, the scheduling overhead of tasks and the memory usage of the file system
manager (NameNode). This paper presents two approaches to improve the small image
file processing performance of HDFS: (1) converting the images into a single large
file by merging and (2) combining many images for a single task without merging.
We also introduce novel Hadoop file formats and record generation methods (for
reading image content) in order to develop these techniques.
Keywords: Hadoop, MapReduce, Cloud Computing, Face Detection.
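
The two approaches named in the abstract can be illustrated with a minimal, Hadoop-independent sketch in Python. It is not the file format or record reader introduced in the paper. The record layout (length-prefixed name and payload) and the names pack_images, iter_records, group_into_splits and max_split_bytes are hypothetical and chosen only for illustration. The first two functions mimic approach (1), merging many small images into one container and reading them back record by record. The third mimics approach (2), assigning several unmerged files to one task up to an assumed size limit.

```python
import io
import struct
from typing import BinaryIO, Dict, Iterator, List, Tuple

# Hypothetical record layout (an assumption, not the paper's format):
# 4-byte big-endian name length, UTF-8 name, 8-byte big-endian payload length, payload.
_NAME_LEN = struct.Struct(">I")
_DATA_LEN = struct.Struct(">Q")


def pack_images(images: Dict[str, bytes], out: BinaryIO) -> None:
    """Approach (1): append every small image to a single container stream."""
    for name, data in images.items():
        encoded = name.encode("utf-8")
        out.write(_NAME_LEN.pack(len(encoded)))
        out.write(encoded)
        out.write(_DATA_LEN.pack(len(data)))
        out.write(data)


def iter_records(src: BinaryIO) -> Iterator[Tuple[str, bytes]]:
    """Yield (file name, image bytes) records from a container written by pack_images."""
    while True:
        header = src.read(_NAME_LEN.size)
        if not header:  # clean end of the container
            return
        (name_len,) = _NAME_LEN.unpack(header)
        name = src.read(name_len).decode("utf-8")
        (data_len,) = _DATA_LEN.unpack(src.read(_DATA_LEN.size))
        yield name, src.read(data_len)


def group_into_splits(sizes: Dict[str, int], max_split_bytes: int) -> List[List[str]]:
    """Approach (2): group unmerged files so that one task handles many images.

    Files are taken in the given order and a new group starts whenever adding
    the next file would exceed max_split_bytes (an assumed tuning parameter).
    """
    splits: List[List[str]] = []
    current: List[str] = []
    current_bytes = 0
    for name, size in sizes.items():
        if current and current_bytes + size > max_split_bytes:
            splits.append(current)
            current, current_bytes = [], 0
        current.append(name)
        current_bytes += size
    if current:
        splits.append(current)
    return splits


if __name__ == "__main__":
    container = io.BytesIO()
    pack_images({"a.jpg": b"\xff\xd8aaa", "b.jpg": b"\xff\xd8bb"}, container)
    container.seek(0)
    for name, data in iter_records(container):
        print(name, len(data))
    print(group_into_splits({"a": 40, "b": 50, "c": 30, "d": 100}, 100))
```

In both cases the aim is the same as in the abstract: fewer, larger units of work, so that per-file metadata and per-task startup costs are paid less often.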