Showing posts with label hive. Show all posts

Tuesday, 30 May 2017

Comparing if two tables have identical content

Have you ever wanted to know if two database tables were the same? Perhaps you made some changes to a query and wanted to know if the result was still the same.

I was essentially looking for a way to diff two tables in Hive SQL, and used the following, which returns no rows when both tables hold the same set of rows:
(TABLE a EXCEPT TABLE b)
UNION ALL
(TABLE b EXCEPT TABLE a);
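One caveat with this approach: EXCEPT compares distinct rows, so two tables that differ only in how many times a row is repeated will still look identical. To try the idea without a Hive cluster, here is a minimal sketch using Python's built-in sqlite3 module. The tables `a` and `b`, their columns, and the helper `table_diff` are made up for illustration, and SQLite needs the `SELECT * FROM` form with subqueries rather than the `TABLE a` shorthand.

```python
import sqlite3


def table_diff(conn, left, right):
    """Return rows that appear in only one of the two tables.

    Table names are interpolated directly into the SQL, so only pass
    trusted identifiers (this is a hypothetical helper for illustration).
    """
    query = (
        f"SELECT * FROM (SELECT * FROM {left} EXCEPT SELECT * FROM {right}) "
        f"UNION ALL "
        f"SELECT * FROM (SELECT * FROM {right} EXCEPT SELECT * FROM {left})"
    )
    return conn.execute(query).fetchall()


# Two small in-memory tables with made-up data: they share the row (1, 'x').
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE a (id INTEGER, name TEXT);
    CREATE TABLE b (id INTEGER, name TEXT);
    INSERT INTO a VALUES (1, 'x'), (2, 'y');
    INSERT INTO b VALUES (1, 'x'), (3, 'z');
    """
)

# Rows present in only one table; an empty list would mean the same distinct rows.
diff = table_diff(conn, "a", "b")
print(diff)
```

An empty result means the two tables contain the same set of distinct rows. Comparing row counts as well would be one way to catch differences in duplicates.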

Sunday, 11 December 2016

Find the space used by a Hive table

There are a few ways to identify the space used by a Hive table.

Unfortunately, my Hive wouldn't accept either of the following commands (and I haven't worked out why...):
  • hive > SHOW TBLPROPERTIES <TableName>("rawDataSize")
  • hive > DESCRIBE EXTENDED <TableName>
So I resorted to using the file system shell:
  • hadoop fs -du -h <URI>
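If you want the size as a number in a script rather than a human-readable listing, the same command can be wrapped in a few lines of Python. This is only a sketch under some assumptions: `hadoop` is on the PATH, the `-s` (summary) flag is used instead of `-h` so the output is a plain byte count, and the size is the first whitespace-separated column of the output (the number of columns varies between Hadoop versions, as far as I know). The function names are my own.

```python
import subprocess


def parse_du_summary(output):
    """Return the byte count from the first column of `hadoop fs -du -s` output."""
    first_line = output.strip().splitlines()[0]
    return int(first_line.split()[0])


def table_size_bytes(table_uri):
    """Run `hadoop fs -du -s <URI>` and return the reported size in bytes.

    Assumes the hadoop CLI is installed and table_uri points at the
    table's directory on the file system.
    """
    result = subprocess.run(
        ["hadoop", "fs", "-du", "-s", table_uri],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_du_summary(result.stdout)
```

Calling `table_size_bytes` with the same `<URI>` you would pass to `hadoop fs -du` should return the size as an integer. On versions that print a second numeric column, that column is the space consumed including replicas, which this sketch ignores.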