Filters: Using .where()
To add here:
- where between dates
- grep
Examples for the .where() method.
Most of these examples are where I am creating a new table with a subselection of rows from the base table:
newtable = oldtable.where(lambda row: row['Column_Name'] == 'something')
Lamda is a function that happens just once in place. In the example above, I'm looking through each row, specifically at the column Column_Name
for the text 'something'. If it matches, I keep it and put it in the newtable. If not, it is discarded.
Where boolean field is True
Checking if a boolean field is true or false:
charters = campus.where(lambda row: row['CFLCHART'] == 1)
Note to self: Test if this will work as
== True
.
Where boolean field is not True
When you want fields that don't match your test.:
traditional = campus.where(lambda row: row['CFLCHART'] != 1)
In this case, we are testing 'CFLCHART' to make sure it does NOT equal 1.
Note to self: Test if this would catch nulls.)
By position in string
If you want to look at a specific position in a string and then check it:
charters = campus.where(lambda row: row['Campus_ID'][3] == '8')
The example is searching the Campus_ID
field looking at the fourth character, checking to see if it is the text '8'
.
Check for blank cells
In this example, I wanted to exclude rows that were null or blank in the Campus
column:
nodistrict = raw.where(lambda row: row['CAMPUS'] is not None)
Filter by date
In this case, I wanted only rows where the CentralTime
date was on or after August 25. The trick here was importing datetime
, and setting that reference as below. It wasn't intuitive for me from the docs.
import datetime
# filter for Aug. 25th or later
flow_harvey = flow_central.where(
lambda row: datetime.date(2017, 8, 25) <= row['CentralTime'].date()
)
Filter by list of items
In this case I only want rows that do NOT match the values in the list.
new_table = old_table.where(lambda row: row['Column name'] not in [
'ALTERNATIVE',
'PRIMARY'
])
- You could change
not in
to justin
to filter the results for positive matches. - You could create a variable to hold the list and then pass that into the
where
statement.