Viktors Rotanovs

HBase Dump and Restore

Cookbook-type snippets to export data out of HBase or import dumps into HBase.

Export a table from HBase into local filesystem:

bin/hbase org.apache.hadoop.hbase.mapreduce.Driver export \
table_name /local/path

Export a table from HBase into HDFS:

bin/hbase org.apache.hadoop.hbase.mapreduce.Driver export \
table_name hdfs://namenode/path
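In HBase 0.20, Export also accepts optional trailing arguments to limit what gets dumped: a maximum number of cell versions and a start/end timestamp range. A hedged sketch (the version count and millisecond timestamps below are illustrative; check the command's own usage output for your release):

```shell
# Sketch: Export <tablename> <outputdir> [<versions> [<starttime> [<endtime>]]]
# Dump up to 3 versions of each cell written between the two
# millisecond timestamps (all three trailing values are illustrative).
bin/hbase org.apache.hadoop.hbase.mapreduce.Driver export \
table_name hdfs://namenode/path 3 1262304000000 1264982400000
```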

Import a table from a local dump into an existing HBase table (Import does not create the table, so it must already exist with the right column families):

bin/hbase org.apache.hadoop.hbase.mapreduce.Driver import \
table_name /local/path

It’s a good idea to count and compare the number of rows before exporting and after importing:

bin/hbase org.apache.hadoop.hbase.mapreduce.Driver \
rowcounter table_name

The number of rows is visible in a Hadoop counter called ROWS, as in the output below:

mapred.JobClient:   org.apache.hadoop.hbase.mapreduce.RowCounter$RowCounterMapper$Counters
mapred.JobClient:     ROWS=103821
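To compare the two counts in a script, you can pull the value out of saved job logs. A minimal sketch, assuming each run's output was saved to a file; count_rows is a hypothetical helper, and the grep pattern matches the ROWS line shown above:

```shell
# Extract the value of the ROWS counter from a saved RowCounter log.
# count_rows is a hypothetical helper; the pattern matches the
# "ROWS=103821" line in the job output above.
count_rows() {
  grep -o 'ROWS=[0-9]*' "$1" | head -n 1 | cut -d= -f2
}

# Usage (log file names are illustrative):
#   count_rows export.log
#   count_rows import.log
```

Run rowcounter once before exporting and once after importing, saving each job's output, then check that the two extracted numbers agree.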

Alternatively, you can use the Hadoop Tool interface, but it may complain about missing classes if hadoop-env.sh is not configured properly. For example, when launched without arguments, it displays the available options:

hadoop jar hbase-0.20.3.jar
An example program must be given as the first argument.
Valid program names are:
export: Write table data to HDFS.
hsf2sf: Bulk convert 0.19 HStoreFiles to 0.20 StoreFiles
import: Import data written by Export.
rowcounter: Count rows in HBase table

An HBase dump is one or more Hadoop SequenceFiles; you can inspect their contents with something like:

hadoop fs -fs local -text table_name/part-m-00000
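If Hadoop isn't handy, plain shell tools can at least sanity-check a dump file: a SequenceFile begins with the magic bytes "SEQ" followed by a version byte. A sketch (check_seqfile is a hypothetical helper; the part file path follows the example above):

```shell
# Quick sanity check without Hadoop: a valid SequenceFile starts
# with the 3-byte magic "SEQ". check_seqfile is a hypothetical helper.
check_seqfile() {
  [ "$(head -c 3 "$1")" = "SEQ" ] && echo "looks like a SequenceFile: $1"
}

# Usage: check_seqfile table_name/part-m-00000
```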