PLEASE NOTE: This documentation applies to Pentaho 8.0 and earlier versions. For Pentaho 8.1 and later, see Unique Rows (HashSet) on the Pentaho Enterprise Edition documentation site.
Description
The Unique Rows (HashSet) transformation step tracks exact duplicate rows. The step can also remove duplicate rows and leave only unique occurrences. The Unique Rows transformation step compares each row only with the row immediately before it, so it detects all duplicates only when its input is sorted. The Unique Rows (HashSet) step does not require sorted input, because it tracks the rows it has already seen in memory.
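The difference between the two approaches can be illustrated with a minimal Python sketch. This is not Pentaho's implementation. The function names `unique_consecutive` and `unique_hashset` and the sample rows are hypothetical, chosen only to show why one approach needs sorted input and the other does not.

```python
from typing import Hashable, Iterable, Iterator, Tuple

# A row is modeled as a tuple of hashable field values (an assumption for this sketch).
Row = Tuple[Hashable, ...]


def unique_consecutive(rows: Iterable[Row]) -> Iterator[Row]:
    """Drop a row only when it equals the row immediately before it."""
    previous = object()  # sentinel that never equals a real row
    for row in rows:
        if row != previous:
            yield row
        previous = row


def unique_hashset(rows: Iterable[Row]) -> Iterator[Row]:
    """Drop a row when it has been seen anywhere earlier in the stream."""
    seen = set()  # holds every distinct row encountered so far
    for row in rows:
        if row not in seen:
            seen.add(row)
            yield row


# Unsorted input: the duplicates of ("a", 1) are not adjacent.
rows = [("a", 1), ("b", 2), ("a", 1), ("b", 2), ("b", 2)]

consecutive_result = list(unique_consecutive(rows))
hashset_result = list(unique_hashset(rows))

print(consecutive_result)
print(hashset_result)
```

On this unsorted input, the consecutive comparison removes only the final adjacent repeat, while the set-based version keeps one occurrence of each row. The trade-off is memory: the set grows with the number of distinct rows in the stream.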
...