Welcome to the Datanyx Community!

How does the Duplicate node functionality work?

Workflow

Last Post by ankit.pradhan 3 years ago

darshini b

(@darshini)

Active Member

Joined: 3 years ago

Posts: 4

Topic starter 29/09/2023 10:45 am

I need to drop duplicates in a dataset. Can I use the duplicate node to achieve this? How does it work?

ankit.pradhan

(@ankit-pradhan)

Eminent Member

Joined: 4 years ago

Posts: 20

29/09/2023 11:02 am

The Duplicates node drops duplicate rows from a dataset based on the keys you select, the sorting order, and any filter criteria.

Step 1: Once the data has been imported, click Data Preparation.

Step 2: Drag and drop the “Duplicates” node onto the main screen. Connect it to the Data Import node (two nodes in total), or to both Data Import nodes if you are working with two inputs (three nodes in total).

Step 3: Select a single unique key or multiple keys. Duplicates are identified and dropped from the dataset based on this selection.

Step 4: Note that the unique key defaults to “Select All”. Click the select button next to the unique key to deselect columns or to pick specific columns.

Step 5: You can also sort the values: pick one key from the unique key selector and choose the sorting order under the Partition and Ranking option. If you do not want to sort the values, deselect the Partition and Ranking option.

Step 6: If required, select the filter table option and enter the filter criteria.

Step 7: The duplicates are then dropped based on the filtered data and the selected key.
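If it helps to see the same logic outside the UI, here is a rough pandas sketch of what the steps above amount to. This is only an analogy, not how the node is implemented: the function `drop_duplicates_like_node`, its parameters, and the sample `orders` table are all hypothetical. It also assumes that the filter is applied first, that the first row of each duplicate group after sorting is the one kept, and that the sort column can be any column.

```python
import pandas as pd


def drop_duplicates_like_node(df, keys=None, sort_by=None, ascending=True, filter_expr=None):
    """Hypothetical pandas analogue of the Duplicates node settings."""
    # Step 6: optional filter criteria, applied before de-duplication (assumption).
    if filter_expr is not None:
        df = df.query(filter_expr)
    # Step 5: optional sort; it decides which row comes first in each duplicate group.
    if sort_by is not None:
        df = df.sort_values(sort_by, ascending=ascending, kind="stable")
    # Steps 3-4: keys=None mirrors the "Select All" default (compare every column).
    # keep="first" is an assumption about which duplicate survives.
    return df.drop_duplicates(subset=keys, keep="first").reset_index(drop=True)


# Made-up sample data for illustration only.
orders = pd.DataFrame({
    "customer_id": [1, 1, 2, 2, 3],
    "amount": [10, 25, 40, 40, 5],
    "status": ["ok", "ok", "ok", "ok", "void"],
})

# Default behaviour: only rows identical in every column count as duplicates.
all_columns = drop_duplicates_like_node(orders)

# One key, sorted by amount descending, restricted to rows with status "ok".
by_customer = drop_duplicates_like_node(
    orders,
    keys=["customer_id"],
    sort_by="amount",
    ascending=False,
    filter_expr="status == 'ok'",
)
```

With the default (all columns), only the exact repeat of customer 2's row is dropped. With `customer_id` as the key, one row per customer remains, and the sort order controls which one.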
