Distinct values
Sometimes you need to see distinct values in a column or filter a table based on distinct values.
Distinct values of a column
To see the distinct values of a column, call the .values_distinct() method on that column. The results come back in a tuple.
rated.columns['C_RATING'].values_distinct()
Here, rated is the name of the table and C_RATING is the column. The result looks like this:
('A', 'I', 'X', 'Z', 'T', 'M')
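The values in the tuple above are not in any obvious order, so it is safer not to rely on their position. As a rough illustration in plain Python (not the library's own code, and with made-up sample values), here are two ways to collapse repeats: a set, which makes no promise about order, and dict.fromkeys, which keeps the order in which values first appear.

```python
# Made-up sample values standing in for a column like C_RATING.
ratings = ["A", "I", "A", "X", "I", "A"]

# A set drops repeats but does not promise any particular order.
distinct = tuple(set(ratings))

# dict.fromkeys also drops repeats and keeps first-seen order.
distinct_in_order = tuple(dict.fromkeys(ratings))

# Sort the unordered result before displaying or comparing it.
print(sorted(distinct))
print(distinct_in_order)
```

If the order matters for a report, sorting the tuple before using it is the simplest fix.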
Filter a table based on distinct values
In this case I started with a table (tec) of campaign contributions over multiple years. Candidates (and their race) are listed more than once. I wanted to get the demographics of the group, so I first had to get a table that listed each candidate only once. I selected the two columns I wanted (Candidate and Race-Ethnicity) and then used the distinct method, keyed on the Candidate column, to get the new table.
candidates = tec.select(['Candidate', 'Race-Ethnicity']).distinct('Candidate')
Now the new candidates table lists each candidate only once.
(NOTE: If a candidate was listed more than once with a different race, the table would keep only the first value. Not a problem with race, but it could bite you in other cases. I did try to pass in ['Candidate', 'Race-Ethnicity'], but the result was NOT distinct.)
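To make the note above concrete, here is a minimal plain-Python sketch of "keep the first row per key". It is not the library's implementation: distinct_rows is a hypothetical helper and the rows are made-up sample data. It assumes a list of column names is treated as one combined key, which would explain why passing both columns can still leave a candidate listed twice.

```python
def distinct_rows(rows, key_columns):
    """Keep the first row seen for each combination of values in key_columns."""
    seen = set()
    kept = []
    for row in rows:
        key = tuple(row[name] for name in key_columns)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


# Made-up sample rows: Candidate A appears with two conflicting values.
rows = [
    {"Candidate": "Candidate A", "Race-Ethnicity": "Group 1"},
    {"Candidate": "Candidate A", "Race-Ethnicity": "Group 2"},
    {"Candidate": "Candidate B", "Race-Ethnicity": "Group 1"},
    {"Candidate": "Candidate B", "Race-Ethnicity": "Group 1"},
]

# Keyed on Candidate alone: one row each, first value wins.
by_candidate = distinct_rows(rows, ["Candidate"])

# Keyed on both columns: the conflicting Candidate A rows both survive.
by_both = distinct_rows(rows, ["Candidate", "Race-Ethnicity"])

print(len(by_candidate), len(by_both))
```

Comparing the two row counts is a quick way to check whether any candidate carries conflicting values before trusting the first-value-wins result.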