
Wednesday, May 2, 2018

Zookeeper Issue : Last transaction was partial


Issue: ZooKeeper was continuously crashing with the error below:

2018-03-23 12:20:53,374 ERROR org.apache.zookeeper.server.persistence.Util: Last transaction was partial.
2018-03-23 12:20:53,375 ERROR org.apache.zookeeper.server.ZooKeeperServerMain: Unexpected exception, exiting abnormally

Brief: Like any other transactional system, ZooKeeper first writes every transaction that changes its state to disk, and only then applies the update. When the transaction log file reaches a certain size, a new transaction log file is created.

ZooKeeper stores its data in a data directory and its transaction log in a transaction log directory.

Data Directory
  • myid - contains a single integer in human readable ASCII text that represents the server id.
  • snapshot.<zxid> - holds the fuzzy snapshot of a data tree.


Log Directory

The Log Directory contains the ZooKeeper transaction logs. Before any update takes place, Zookeeper ensures that the transaction that represents the update is written to non-volatile storage. 
A new log file is started each time a snapshot is begun. The log file's suffix is the first zxid written to that log.
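
To make this concrete, here is an illustrative layout, assuming the common packaged paths and a separate dataLogDir (your directories and zxids will differ); note that ZooKeeper keeps both kinds of files under a version-2 subdirectory:

# ls /var/lib/zookeeper
myid  version-2
# ls /var/lib/zookeeper/version-2
snapshot.da085  snapshot.da091
# ls /var/lib/zookeeper-log/version-2
log.da086  log.da092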

After talking to the team about what had been done prior to this error, I learned that the Hadoop ecosystem had been aborted once in the middle of a transaction.
Since the termination was abnormal and the issue pointed at a transaction entry, I rushed to the log directory and found a ZooKeeper transaction log file of size 0 (a junk file). While booting up, ZooKeeper tries to replay its logs to reach a consistent state; because this file was zero-sized, it could not replay the transactions and hence failed to start.

-rw-r--r-- 1 zookeeper zookeeper       0 Mar 23 03:17 log.da092

Solution: I removed the junk log file, started ZooKeeper again, and it came up fine. Simple enough!!!!
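
For anyone hitting the same thing, here is a minimal sketch of the cleanup, assuming the layout above; stop ZooKeeper first, and prefer moving the file aside over deleting it outright:

# find /var/lib/zookeeper-log/version-2 -name 'log.*' -size 0
/var/lib/zookeeper-log/version-2/log.da092
# mv /var/lib/zookeeper-log/version-2/log.da092 /tmp/
# zkServer.sh start

(zkServer.sh is the stock start script; a packaged install may use a service unit instead.)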

Relying on transaction log files alone would not be efficient for a heavily loaded system, because, say, on startup ZooKeeper would have to replay every transaction it had ever processed.

So, periodically, ZooKeeper writes a snapshot of the current state of its in-memory database to a file.
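
The knobs behind this behaviour live in zoo.cfg; here is a hedged peek at the relevant settings (path and values illustrative; snapCount, roughly the number of transactions between snapshots, defaults to 100000):

# grep -E 'dataDir|dataLogDir|snapCount' /etc/zookeeper/conf/zoo.cfg
dataDir=/var/lib/zookeeper
dataLogDir=/var/lib/zookeeper-log
snapCount=100000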

In short, both snapshots and transaction logs are very important to ZooKeeper.

Enjoy Learning!!!

Tuesday, December 13, 2016

HBASE: Table already exists


Last weekend, while experimenting with the HBase component of the Hadoop ecosystem, I came across an issue while creating a table.

hbase(main):003:0> create 'emp', 'personal data', 'professional data'

ERROR: Table already exists: emp!

Cool, so the table is already present; that's OK, maybe it really is there. Sounds easy-peasy, right? But somehow it turned out to be very interesting for me.

At first I checked its existence: I listed all the tables, and unfortunately it was not there.

hbase(main):004:0> list
TABLE
emp_v
random_test
temp_test
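
As a quick double-check, one could also have run the HBase shell's exists command (the message shown is what the shell typically prints when a table is missing):

exists 'emp'
Table emp does not exist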

So, to outsmart HBase :), I tried deletion.

hbase(main):005:0> drop 'emp'

ERROR: Table emp does not exist.

Excellent. So in one context the HBase table is not present, and in another it is; sounds wacky, right? But all in all it was good: there was something for me to dig into. There is a saying that "half knowledge is a dangerous thing", so why not add the other half and make it whole?

I went ahead and looked for any trace of a file that might have been left behind and that HBase might still be reading from.

I checked HDFS, and I saw only one namespace, i.e. "default".


# hdfs dfs -ls /hbase/data

drwxr-xr-x   - hbase hbase          0 2016-12-11 20:51 /hbase/data/default

As you can see, there is no sign of an emp directory so far; let us look inside the default namespace.


# hdfs dfs -ls /hbase/data/default

drwxr-xr-x   - hbase hbase          0 2016-12-08 18:20 /hbase/data/default/emp_v
drwxr-xr-x   - hbase hbase          0 2016-12-09 13:45 /hbase/data/default/random_test
drwxr-xr-x   - hbase hbase          0 2016-12-09 00:08 /hbase/data/default/temp_test
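
A quicker way to sweep the whole HBase root for leftovers is a recursive listing piped through grep (note it also matches emp_v, so read the output carefully):

# hdfs dfs -ls -R /hbase | grep emp
drwxr-xr-x   - hbase hbase          0 2016-12-08 18:20 /hbase/data/default/emp_v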


So the above didn't help either; but it remained true that HBase was reading it from somewhere. But where? I went back to basics, to how a client coordinates with HBase, and this proved to be a fruitful step for me.

The way forward was "The Zookeeper".

A distributed HBase relies completely on Zookeeper (for cluster configuration and management). In HBase, Zookeeper coordinates, communicates, and shares state between the Masters and Region Servers.
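
(In case you are wondering how HBase even finds its ZooKeeper ensemble: it reads hbase.zookeeper.quorum from hbase-site.xml; the path and host names below are illustrative.)

# grep -A1 'hbase.zookeeper.quorum' /etc/hbase/conf/hbase-site.xml
    <name>hbase.zookeeper.quorum</name>
    <value>zk1.example.com,zk2.example.com,zk3.example.com</value>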

Zookeeper is a client/server system for distributed coordination that exposes an interface similar to a filesystem, where each node (called a Znode) may contain data and a set of children. 

Each Znode has a name and can be identified using a filesystem-like path (for example, /root-znode/sub-znode/my-znode).
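
Here is a quick taste of that filesystem-like interface using the stock ZooKeeper CLI; the znodes under /hbase vary with the HBase version, so treat this listing as illustrative:

ls /
[hbase, zookeeper]
ls /hbase
[master, rs, table, ...]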

The znode that seems important in this context:

Znode: /hbase/table (zookeeper.znode.masterTableEnableDisable)
Description: Used by the master to track the table state during assignments (disabling/enabling states, for example).



ZooKeeper can be accessed via its command-line client; I logged in to the ZooKeeper client via hbase zkcli, or you may use the zookeeper-client command.

I ran the ls command and the issue seemed cracked: here is the bright and shiny emp table.

ls /hbase/table
[ random_test, emp,  hbase:meta,  hbase:namespace, temp_test, emp_v] 
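
For the curious: still inside the same CLI session, get dumps a znode's payload; for a table znode that is HBase's serialized state, so don't expect readable text:

get /hbase/table/emp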

As I mentioned above while describing the znode, HBase stores some schema and state for each table in ZooKeeper so that it can coordinate between all the region servers.
You might be wondering why the table's metadata was still present in ZooKeeper although the table had been dropped from HBase.
What is the solution?
Since it looked like an orphan table to me, why not remove its metadata from ZooKeeper? ZooKeeper provides the rmr command to remove a specified znode along with its children.
So, from the same login, I executed the command below, which was expected to remove the znode, i.e. the metadata of the emp table.

rmr /hbase/table/emp
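
One caveat worth noting: rmr was deprecated in ZooKeeper 3.5, where the equivalent command for removing a znode and its children is:

deleteall /hbase/table/emp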

Then I ran the ls command again to check whether the emp table was still present; finally, the table was gone, a big relief to me.

ls /hbase/table
[ random_test,  hbase:meta,  hbase:namespace, temp_test, emp_v] 

And I created the table again, and this time I succeeded :)

hbase(main):001:0>  create 'emp', 'personal data', 'professional data'
0 row(s) in 2.5890 seconds

=> Hbase::Table - emp