HDFS Connector
This chapter describes how to use the Tibero HDFS Connector.
HDFS External Table Creation
The syntax for creating an external table using the HDFS Connector is the same as the general syntax for creating an external table.
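The following sketch shows the general form. Only the 'hdfs://Host[:Port]/FilePath' URLs in the LOCATION clause are specific to the HDFS Connector; the clause names and layout follow the ordinary external table DDL and are shown here for illustration only.

    CREATE TABLE table_name ( column_definitions )
    ORGANIZATION EXTERNAL
    (
        DEFAULT DIRECTORY directory_object
        ACCESS PARAMETERS ( access_parameters )
        LOCATION ( 'hdfs://Host[:Port]/FilePath' [, 'hdfs://Host[:Port]/FilePath' ...] )
    );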
In the syntax above, Port is optional and defaults to 8020.
The following is an example of defining an external table using the HDFS Connector.
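In this sketch, the table name, column definitions, directory object name, and access parameters are illustrative assumptions; the host name and file paths are those described below.

    CREATE TABLE ext_hdfs
    (
        id   NUMBER,
        name VARCHAR(100)
    )
    ORGANIZATION EXTERNAL
    (
        DEFAULT DIRECTORY ext_dir
        ACCESS PARAMETERS
        (
            RECORDS DELIMITED BY NEWLINE
            FIELDS TERMINATED BY ','
        )
        LOCATION
        (
            'hdfs://hadoop_name/user/tibero/f0.txt',
            'hdfs://hadoop_name/user/tibero/f1.txt',
            'hdfs://hadoop_name/user/tibero/f2.txt'
        )
    );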
The example above defines an external table over three HDFS files. The host name of the HDFS NameNode is 'hadoop_name', and the port number defaults to 8020. The table reads the three files 'f0.txt', 'f1.txt', and 'f2.txt' from the '/user/tibero' directory in HDFS.
The directory object is required only by the DDL syntax for creating the external table; it has no effect on the HDFS file path. Both local files and HDFS files can be used to create an external table, and all features available to external tables work in the same way.
For more information about external table creation syntax, refer to "Tibero SQL Reference Guide".
Querying with HDFS Connector
To query with the HDFS Connector, create an external table and then execute queries against it.
The external table interface supports all of Tibero's query features, including joins with regular Tibero tables, the various aggregate functions, and user-defined functions (UDFs).
The following is an example of using the HDFS Connector to execute a query.
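The queries below are a sketch against the ext_hdfs table defined earlier; the joined table dept and its columns are hypothetical.

    -- Aggregate directly over the HDFS files.
    SELECT COUNT(*) FROM ext_hdfs;

    -- Join the external table with a regular Tibero table (dept is hypothetical).
    SELECT e.name, d.dept_name
    FROM ext_hdfs e, dept d
    WHERE e.id = d.id;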
The /*+ parallel */ hint can also be used for parallel execution, just as with regular tables (see the sketch after this paragraph).
Parallel execution can improve performance because HDFS files are divided into HDFS block units and scanned in parallel. Note, however, that DML statements cannot be executed against HDFS Connector external tables.
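A parallel scan might look like the following; the PARALLEL(alias, degree) hint form and the degree of 4 are assumptions for illustration.

    -- Scan the HDFS files in parallel, one granule per HDFS block (degree of 4 assumed).
    SELECT /*+ PARALLEL(e, 4) */ COUNT(*)
    FROM ext_hdfs e;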