Sql #1: sql quick reference table

DDL Statements (Data Definition Language) πŸ”—︎

CREATE TABLE πŸ”—︎

  • Basic Usage

    • Partitioning: Logically, there is only one table or index, but in reality, there are multiple physical partitions.
    • Lifecycle: The lifecycle is calculated from the last time the partition data was modified. After a partitioned table is retired, it won’t be deleted.
    • Example: Create a partitioned table with a lifecycle of 180 days. You can specify the partition if there is one, or skip it if there isn’t.
    CREATE TABLE IF NOT EXISTS test(
      a STRING, b BIGINT, d STRING
    ) PARTITIONED BY(ds STRING) 
    LIFECYCLE 180
    
  • Advanced Usage

    • Create a table based on another table’s structure: The new table will have the same structure and comments, but a different lifecycle.
    CREATE TABLE IF NOT EXISTS test LIKE bbb
    
    • Create a table based on a SELECT result: No partitions are included.
    CREATE TABLE ttt
    AS
    SELECT a, b
    FROM sss
    WHERE ds = '20180212'
    

DROP TABLE πŸ”—︎

  • Drop a table directly, without recovery:
DROP TABLE IF EXISTS table_name

ALTER TABLE πŸ”—︎

  • Rename the table:
ALTER TABLE table_name RENAME TO new_table_name
  • Add or drop partitions:
ALTER TABLE table_name ADD IF NOT EXISTS 
PARTITION partition_spec (partition_col1=partition_col1_value1,...)
ALTER TABLE table_name DROP IF EXISTS 
PARTITION partition_spec (partition_col1=partition_col1_value1, ...)
  • Modify table structure:

    • Change the table comment:
    ALTER TABLE table_name SET COMMENT 'table_comments'
    
    • Add columns to the table:
    ALTER TABLE table_name ADD COLUMNS (col1 type, col2 type, ...)
    
    • Rename a column:
    ALTER TABLE table_name CHANGE COLUMN old_col RENAME TO new_col
    
    • Modify column comments (including partition column comments):
    ALTER TABLE table_name CHANGE COLUMN col1 COMMENT 'comments'
    
    • Change the table lifecycle:
    ALTER TABLE table_name SET LIFECYCLE days;
    

View Table Structure πŸ”—︎

DESC table_name

DML Statements (Data Manipulation Language) πŸ”—︎

SELECT πŸ”—︎

  • Basic Query:
SELECT 
  a 
FROM 
  aa 
WHERE 
  ds = '20120201'
  • Limit the number of rows:
SELECT 
  a 
FROM 
  aa 
WHERE 
  ds = '20120201' 
LIMIT 
  10
  • Distinct (Remove duplicates):
SELECT 
  DISTINCT a 
FROM 
  aa 
WHERE 
  ds = '20120201'
  • Data aggregation: Aggregate functions: SUM, COUNT, AVG, MAX/MIN
SELECT 
  a, 
  COUNT(1) AS CNT 
FROM 
  aa 
WHERE 
  ds = '20120201' 
GROUP BY 
  a 
  • Filter aggregated results:
SELECT 
  a, 
  COUNT(1) AS CNT 
FROM 
  aa 
WHERE 
  ds = '20120201' 
GROUP BY 
  a 
HAVING 
  COUNT(1) > 1000
  • Subquery: Use a query result as a subtable
SELECT 
  a 
FROM 
  (
    SELECT 
      b AS b_rename 
    FROM 
      btable 
    WHERE 
      ds = '20120202' 
      AND cc = '1' 
    GROUP BY 
      b
  ) atable 
WHERE 
  atable.cc > 1000
  • Merge data from multiple tables:
    The tables used in UNION ALL must have the same column names and types.
SELECT 
  a 
FROM 
  (
    SELECT 
      b AS b_rename 
    FROM 
      btable 
    WHERE 
      ds = '20120202' 
      AND cc = '1' 
    UNION ALL 
    SELECT 
      c AS b_rename 
    FROM 
      ctable 
    WHERE 
      ds = '20100202'
  ) atable 
WHERE 
  atable.cc > 1000
  • Join operations: Join multiple tables for data analysis.
    • Inner Join: Only shows records where there is a complete match (intersection).
SELECT a.* FROM a JOIN b ON (a.id=b.id)
  • Left Outer Join: All records from the left table are selected; unmatched records from the right table are assigned null values.
  • Right Outer Join: All records from the right table are selected; unmatched records from the left table are assigned null values.
  • Full Outer Join: All records from both tables are selected; unmatched records are assigned null values.

INSERT πŸ”—︎

(Details not provided)

DCL Statements (Data Control Language) πŸ”—︎

Permission control statements.

Functions πŸ”—︎

System-Provided and User-Defined Functions πŸ”—︎

Date Formatting πŸ”—︎

TO_CHAR('2013-03-01 00:00:02', 'yyyymmdd')

Date Arithmetic πŸ”—︎

DATETIME DATEADD(DATETIME date, BIGINT delta, STRING datepart)
datepart options: dd, mm, yyyy, hh, mi, ss

Example: DATEADD('2005-02-28 00:00:12', 1, 'dd')

String Concatenation πŸ”—︎

CONCAT('ab', 'c')

Sort by a Field πŸ”—︎

ROW_NUMBER() OVER(PARTITION BY col1 ORDER BY col2 DESC)

SELECT 
  a, 
  b, 
  ROW_NUMBER() OVER(
    PARTITION BY a 
    ORDER BY 
      b DESC
  ) AS rnk 
FROM 
  (
    SELECT 
      ...
  ) atable

Get the Latest Partition of a Table πŸ”—︎

SELECT * FROM xx WHERE ds=MAX_PT('xx')

Be the first to know when I post cool stuff

Subscribe to get my latest articles. No spam, unsubscribe anytime.