SHMstor: A Scalable Hadoop/MapReduce-based Storage System for Small-Write Efficiency
Lingfang Zeng; Wei Shi; Fan Ni; Song Jiang; Xiaopeng Fan; Chengzhong Xu; Yang Wang
2018
Conference Date | 2018
Conference Location | Guangzhou
English Abstract | Abstract—Small files are frequently created and accessed in pervasive computing, where information is processed with limited resources by linking with objects as they are encountered. The Hadoop framework, though very popular in practice as a de facto big-data processing platform, cannot process small files efficiently. In this paper, we propose a scalable HDFS-based storage framework, named SHAstor, to improve small-write throughput for the pervasive computing paradigm. Compared with classic HDFS, the essence of this approach is to merge the incoming small writes into a large chunk of data, either at the client side or at the server side, and then store it as a big file in the framework. As a consequence, this substantially reduces the number of small files needed to process the pervasively gathered information. To reach this goal, the framework takes HDFS as its basis and adds three extra modules that merge and index the small files while the read/write operations of pervasive applications are performed. To further facilitate this process, a new ancillary namenode can optionally be installed to store the index table. With this optimization, SHAstor not only optimizes small writes but also scales out with the number of datanodes to improve the performance of pervasive applications.
Language | English
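The core idea in the abstract — merging many small writes into one large chunk and keeping an index table that maps each small file to its (offset, length) inside that chunk — can be sketched in a few lines. This is an illustrative, self-contained sketch only, not SHAstor's actual code: the class name `SmallWriteMerger` and the in-memory `bytearray` standing in for a big HDFS file are assumptions for demonstration; the real system performs the merge at the client or server side and stores the index table in an ancillary namenode.

```python
class SmallWriteMerger:
    """Sketch of the merge-and-index scheme: many small writes are
    appended to one large chunk, and an index table records where
    each small file lives inside it."""

    def __init__(self):
        self.chunk = bytearray()  # stands in for one big file stored in HDFS
        self.index = {}           # name -> (offset, length): the "index table"

    def write(self, name: str, data: bytes) -> None:
        # Append the small write to the large chunk and record its location.
        offset = len(self.chunk)
        self.chunk.extend(data)
        self.index[name] = (offset, len(data))

    def read(self, name: str) -> bytes:
        # Look up the small file in the index table and slice it back out.
        offset, length = self.index[name]
        return bytes(self.chunk[offset:offset + length])


merger = SmallWriteMerger()
merger.write("sensor-001.log", b"temp=21.5")
merger.write("sensor-002.log", b"temp=19.8")
print(merger.read("sensor-001.log"))  # b'temp=21.5'
print(len(merger.index))              # 2
```

Because reads and writes only touch one large file plus a small index lookup, the number of files the namenode must track stays constant regardless of how many small writes arrive — which is the scalability point the abstract makes.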
Content Type | Conference Paper
Source URL | [http://ir.siat.ac.cn:8080/handle/172644/14129]
Collection | Shenzhen Institutes of Advanced Technology, Digital Institute
Recommended Citation (GB/T 7714) | Lingfang Zeng, Wei Shi, Fan Ni, et al. SHMstor: A Scalable Hadoop/MapReduce-based Storage System for Small-Write Efficiency[C]. In: . Guangzhou. 2018.