where

DataFrame.where(index)[source]

Extract a subset of rows.

Creates a new View as a subselection of the current instance.

Args:
index (BooleanColumnView or FloatColumnView or FloatColumn):

Boolean column indicating the rows you want to select.

Example:

Generate example data:

data = dict(
    fruit=["banana", "apple", "cherry", "cherry", "melon", "pineapple"],
    price=[2.4, 3.0, 1.2, 1.4, 3.4, 3.4],
    join_key=["0", "1", "2", "2", "3", "3"])

fruits = getml.DataFrame.from_dict(data, name="fruits",
roles={"categorical": ["fruit"], "join_key": ["join_key"], "numerical": ["price"]})

fruits
| join_key | fruit       | price     |
| join key | categorical | numerical |
--------------------------------------
| 0        | banana      | 2.4       |
| 1        | apple       | 3         |
| 2        | cherry      | 1.2       |
| 2        | cherry      | 1.4       |
| 3        | melon       | 3.4       |
| 3        | pineapple   | 3.4       |

Apply where condition. This creates a new DataFrame called “cherries”:

cherries = fruits.where(
    fruits["fruit"] == "cherry")

cherries
| join_key | fruit       | price     |
| join key | categorical | numerical |
--------------------------------------
| 2        | cherry      | 1.2       |
| 2        | cherry      | 1.4       |