- Create Table AS (CTAS), does it distribute data randomly or based on table on which it received data.
Newly created table from CTAS gets distribution from the original table.
- Which one is better DELETE or TRUNCATE? Why?
Of course TRUNCATE. If you use DELETE, Greenplum will not remove the data; instead it’ll logically remove by flagging xmax with XID.
- Can we update all columns in Greenplum?
No. Distribution key columns may not be updated.
- When no distribution clause is used while creating a table, what distribution is used by Greenplum?
Uses hash distribution on first column. Same as Netezza and Teradata
- What is the segment in Greenplum?
Database instances in the Greenplum system are called segments. Segment stores the data and carry out the query processing. In Greenplum distributed systems, each segment contains a distinct portion of the data.
- What are gangs in Greenplum?The processes that are working on the same slice of the query plan but on different segments are called gangs.
- Can you create zero-column table in Greenplum?
Yes. We can create a table with no columns. For example,
training=> create table test();
NOTICE: Table has no attributes to distribute on.
- Upto how many columns can you provide in DISTRIBUTED BY clause?
You can provide as many column as you have in table. However, it is recommended to use max two columns in DISTRIBUTED BY clause
How do you generate the DDL for the particular table?
use pg_dump utility with option -s and -t.
pg_dump -U user_name -h host database -s -t test -f test.sql
- How do you derive columns from previously computed column values?
You can use Greenplum WITH or Common Table Expressions (CTE) to derive columns from previously computed columns.