I read the guidelines below on Developer Exchange a while ago (written for 13.10). Are they still valid for 14.10, or has anything changed?
Guidelines for Stats Collection
For a partitioned table, it is recommended that you always collect statistics on:
· PARTITION: This tells the optimizer how many row partitions are empty, a histogram of how many rows are in each row partition, and the compression ratio for column partitions. This statistic is used for optimizer costing.
· Any partitioning columns. This provides cardinality estimates to the optimizer when the partitioning column is part of a query’s selection criteria.
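For reference, here is roughly what those two "always collect" statistics look like as statements. This is only a sketch: the table sales_db.orders and its partitioning column order_date are made-up names for illustration, not from the original guidelines.

```sql
-- Hypothetical table: sales_db.orders, row-partitioned on order_date.
-- Names are assumptions for illustration only.

-- Statistics on the system-derived PARTITION column
-- (empty partitions, rows per partition).
COLLECT STATISTICS COLUMN (PARTITION) ON sales_db.orders;

-- Statistics on the partitioning column itself.
COLLECT STATISTICS COLUMN (order_date) ON sales_db.orders;
```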
For a partitioned primary index table, consider collecting these statistics if the partitioning column is not part of the table’s primary index (PI):
· (PARTITION, PI). This statistic is most important when a given PI value may exist in multiple partitions, and can be skipped if a PI value only goes to one partition. It provides the optimizer with the distribution of primary index values across the partitions. It helps in costing the sliding-window and rowkey-based merge join, as well as dynamic partition elimination.
· (PARTITION, PI, partitioning column). This statistic provides the combined number of distinct values for the combination of PI and partitioning columns after partition elimination. It is used in rowkey join costing.
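As a sketch of the two multicolumn statistics above, assuming a made-up table sales_db.orders whose primary index is order_id and whose partitioning column order_date is not part of that PI (all three names are hypothetical):

```sql
-- Hypothetical table: sales_db.orders
--   primary index:       order_id   (assumed name)
--   partitioning column: order_date (assumed name, not part of the PI)

-- (PARTITION, PI): distribution of PI values across partitions.
COLLECT STATISTICS COLUMN (PARTITION, order_id) ON sales_db.orders;

-- (PARTITION, PI, partitioning column): combined distinct values,
-- used for rowkey join costing.
COLLECT STATISTICS COLUMN (PARTITION, order_id, order_date) ON sales_db.orders;
```

Per the guideline, the first of these can be skipped when each PI value lands in only one partition.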