Details
-
Bug
-
Status: Open
-
Minor
-
Resolution: Unresolved
-
2.1.0, 2.3.4
-
None
-
None
-
None
Description
The bug was found while working with configured environment: Apache Hadoop cluster and Apache Hive (the same issue on both versions 2.1 and 2.3).
Configuration of working environment was performed in the following steps:
- Installed Apache Hadoop cluster
- Installed over cluster - Hive component
- Configured properties in hive-site.xml
<property> <name>javax.jdo.option.ConnectionURL</name> <value>jdbc:derby:;databaseName=test_db;create=true</value> <description>the URL of the Derby database</description> </property> <property> <name>javax.jdo.option.ConnectionDriverName</name> <value>org.apache.derby.jdbc.EmbeddedDriver</value> <description>Driver class name for a JDBC metastore</description> </property> <property> <name>hive.metastore.warehouse.dir</name> <value>/usr/hive/warehouse</value> <description>location of default database for the warehouse</description> </property> <property> <name>datanucleus.schema.autoCreateAll</name> <value>true</value> <description>creates necessary schema on a startup if one doesnt exist.</description> </property> <property> <name>hive.metastore.schema.verification</name> <value>false</value> </property>
- Installed IJ utility to work with embedded DB, according to steps in the guide.
- Launched services: hiveserver2 and metastore
hive --service hiveserver2
hive --service metastore
- The next step was to start the utility in the next steps:
$ cd $DERBY_INSTALL/bin
$ . setEmbeddedCP
$ echo $CLASSPATH /opt/Apache/db-derby-10.14.2.0-bin/lib/derby.jar:/opt/Apache/db-derby-10.14.2.0-bin/lib/derbytools.jar:
- Start up ij with this command: ./ij
connect 'jdbc:derby:/opt/hadoop/hive/metastore_db';
But after performing that command appears the next error.
ij> connect 'jdbc:derby:/opt/hadoop/hive/metastore_db'; ERROR XJ040: Failed to start database '/opt/hadoop/hive/metastore_db' with class loader sun.misc.Launcher$AppClassLoader@5e2de80c, see the next exception for details. ERROR XSDB6: Another instance of Derby may have already booted the database /opt/hadoop/hive/metastore_db.
To resolve this error need to stop all hive services, like hiveserver2 and metastore. Whereupon again enter the command for connect to DB connect 'jdbc:derby:/opt/hadoop/hive/metastore_db'; that was successful.
- As far as metastore service was launched, he generated all tables lists in the package.jdo file, which we could see if perform the next command in ij utility: ij(CONNECTION1)> show tables;
ij> connect 'jdbc:derby:/opt/hadoop/hive/metastore_db'; ij(CONNECTION1)> show tables; TABLE_SCHEM |TABLE_NAME |REMARKS ------------------------------------------------------------------------ SYS |SYSALIASES | SYS |SYSCHECKS | SYS |SYSCOLPERMS | SYS |SYSCOLUMNS | SYS |SYSCONGLOMERATES | SYS |SYSCONSTRAINTS | SYS |SYSDEPENDS | SYS |SYSFILES | SYS |SYSFOREIGNKEYS | SYS |SYSKEYS | SYS |SYSPERMS | SYS |SYSROLES | SYS |SYSROUTINEPERMS | SYS |SYSSCHEMAS | SYS |SYSSEQUENCES | SYS |SYSSTATEMENTS | SYS |SYSSTATISTICS | SYS |SYSTABLEPERMS | SYS |SYSTABLES | SYS |SYSTRIGGERS | SYS |SYSUSERS | SYS |SYSVIEWS | SYSIBM |SYSDUMMY1 | APP |BUCKETING_COLS | APP |CDS | APP |COLUMNS_V2 | APP |DATABASE_PARAMS | APP |DBS | APP |GLOBAL_PRIVS | APP |PARTITIONS | APP |PARTITION_KEYS | APP |PARTITION_KEY_VALS | APP |PARTITION_PARAMS | APP |PART_COL_STATS | APP |ROLES | APP |SDS | APP |SD_PARAMS | APP |SEQUENCE_TABLE | APP |SERDES | APP |SERDE_PARAMS | APP |SKEWED_COL_NAMES | APP |SKEWED_COL_VALUE_LOC_MAP | APP |SKEWED_STRING_LIST | APP |SKEWED_STRING_LIST_VALUES | APP |SKEWED_VALUES | APP |SORT_COLS | APP |TABLE_PARAMS | APP |TAB_COL_STATS | APP |TBLS | APP |VERSION | 50 rows selected
- But if instead of start/stop services use the next command: /opt/hadoop/hive/bin/schematool -dbType derby -initSchema.
The result will be fundamentally different because the this command involve hive-schema-2.1.0.mysql.sql script that generate the following tables:
ij(CONNECTION1)> show tables; TABLE_SCHEM |TABLE_NAME |REMARKS ------------------------------------------------------------------------ SYS |SYSALIASES | SYS |SYSCHECKS | SYS |SYSCOLPERMS | SYS |SYSCOLUMNS | SYS |SYSCONGLOMERATES | SYS |SYSCONSTRAINTS | SYS |SYSDEPENDS | SYS |SYSFILES | SYS |SYSFOREIGNKEYS | SYS |SYSKEYS | SYS |SYSPERMS | SYS |SYSROLES | SYS |SYSROUTINEPERMS | SYS |SYSSCHEMAS | SYS |SYSSEQUENCES | SYS |SYSSTATEMENTS | SYS |SYSSTATISTICS | SYS |SYSTABLEPERMS | SYS |SYSTABLES | SYS |SYSTRIGGERS | SYS |SYSUSERS | SYS |SYSVIEWS | SYSIBM |SYSDUMMY1 | APP |AUX_TABLE | APP |BUCKETING_COLS | APP |CDS | APP |COLUMNS | APP |COLUMNS_V2 | APP |COMPACTION_QUEUE | APP |COMPLETED_COMPACTIONS | APP |COMPLETED_TXN_COMPONENTS | APP |DATABASE_PARAMS | APP |DBS | APP |DB_PRIVS | APP |DELEGATION_TOKENS | APP |FUNCS | APP |FUNC_RU | APP |GLOBAL_PRIVS | APP |HIVE_LOCKS | APP |IDXS | APP |INDEX_PARAMS | APP |KEY_CONSTRAINTS | APP |MASTER_KEYS | APP |NEXT_COMPACTION_QUEUE_ID | APP |NEXT_LOCK_ID | APP |NEXT_TXN_ID | APP |NOTIFICATION_LOG | APP |NOTIFICATION_SEQUENCE | APP |NUCLEUS_TABLES | APP |PARTITIONS | APP |PARTITION_EVENTS | APP |PARTITION_KEYS | APP |PARTITION_KEY_VALS | APP |PARTITION_PARAMS | APP |PART_COL_PRIVS | APP |PART_COL_STATS | APP |PART_PRIVS | APP |ROLES | APP |ROLE_MAP | APP |SDS | APP |SD_PARAMS | APP |SEQUENCE_TABLE | APP |SERDES | APP |SERDE_PARAMS | APP |SKEWED_COL_NAMES | APP |SKEWED_COL_VALUE_LOC_MAP | APP |SKEWED_STRING_LIST | APP |SKEWED_STRING_LIST_VALUES | APP |SKEWED_VALUES | APP |SORT_COLS | APP |TABLE_PARAMS | APP |TAB_COL_STATS | APP |TBLS | APP |TBL_COL_PRIVS | APP |TBL_PRIVS | APP |TXNS | APP |TXN_COMPONENTS | APP |TYPES | APP |TYPE_FIELDS | APP |VERSION | APP |WRITE_SET | 81 rows selected
As a result we could see that property that was used in the hive-site.xml doesn’t work properly and we should avoid using her. That is we must use only schematool.
<property> <name>datanucleus.schema.autoCreateAll</name> <value>true</value> <description>creates necessary schema on a startup if one doesnt exist.</description> </property>