First query execution¶
First of all, we need to prepare some table for query execution. For example, you can make a simple text-based table as follows:
$ mkdir /home/x/table1 $ cd /home/x/table1 $ cat > data.csv 1|abc|1.1|a 2|def|2.3|b 3|ghi|3.4|c 4|jkl|4.5|d 5|mno|5.6|e <CTRL + D>
Apache Tajo™ provides a SQL shell which allows users to interactively submit SQL queries. In order to use this shell, please execute bin/tsql
$ $TAJO_HOME/bin/tsql tajo>
In order to load the table we created above, we should think of a schema of the table. Here, we assume the schema as (int, text, float, text).
$ $TAJO_HOME/bin/tsql tajo> create external table table1 ( id int, name text, score float, type text) using csv with ('csvfile.delimiter'='|') location 'file:/home/x/table1';
To load an external table, you need to use ‘create external table’ statement. In the location clause, you should use the absolute directory path with an appropriate scheme. If the table resides in HDFS, you should use ‘hdfs’ instead of ‘file’.
If you want to know DDL statements in more detail, please see Query Language.
tajo> \d table1 ``\d`` command shows the list of tables. :: tajo> \d table1 table name: table1 table path: file:/home/x/table1 store type: CSV number of rows: 0 volume (bytes): 78 B schema: id INT name TEXT score FLOAT type TEXT
\d [table name] command shows the description of a given table.
Also, you can execute SQL queries as follows:
tajo> select * from table1 where id > 2; final state: QUERY_SUCCEEDED, init time: 0.069 sec, response time: 0.397 sec result: file:/tmp/tajo-hadoop/staging/q_1363768615503_0001_000001/RESULT, 3 rows ( 35B) id, name, score, type - - - - - - - - - - - - - 3, ghi, 3.4, c 4, jkl, 4.5, d 5, mno, 5.6, e tajo> exit bye
Feel free to enjoy Tajo with SQL standards. If you want to know more explanation for SQL supported by Tajo, please refer SQL Language.