Query Builder API
- class arcticdb.QueryBuilder
Build a query to process read results with. Syntax is designed to be similar to Pandas:
>>> q = QueryBuilder()
>>> q = q[q["a"] < 5]  (equivalent to q = q[q.a < 5] provided the column name is also a valid Python variable name)
>>> dataframe = lib.read(symbol, query_builder=q).data
QueryBuilder objects are stateful, and so should not be reused without reinitialising:
>>> q = QueryBuilder()
For Group By and Aggregation functionality, see the documentation for the groupby method. For projection functionality, see the documentation for the apply method.
Supported numeric operations when filtering:
Binary comparisons: <, <=, >, >=, ==, !=
Unary NOT: ~
Binary arithmetic: +, -, *, /
Unary arithmetic: -, abs
Binary combinators: &, |, ^
List membership: isin, isnotin (also accessible with == and !=)
isin/isnotin accept lists, sets, frozensets, 1D ndarrays, or *args unpacking. For example:
>>> l = [1, 2, 3]
>>> q.isin(l)
is equivalent to…
>>> q.isin(1, 2, 3)
Boolean columns can be filtered on directly:
>>> q = QueryBuilder()
>>> q = q[q["boolean_column"]]
and combined with other operations intuitively:
>>> q = QueryBuilder()
>>> q = q[(q["boolean_column_1"] & ~q["boolean_column_2"]) & (q["numeric_column"] > 0)]
Arbitrary combinations of these expressions are possible, for example:
>>> q = q[(((q["a"] * q["b"]) / 5) < (0.7 * q["c"])) & (q["b"] != 12)]
See tests/unit/arcticdb/version_store/test_filtering.py for more example uses.
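The combined expressions above wrap every comparison in parentheses. This follows from Python's operator precedence, in which & and | bind more tightly than comparison operators. The sketch below uses plain integers in place of column expressions to show the difference; the variable names are illustrative only.

```python
# Plain Python values stand in for column values here. Operator precedence is
# a property of Python syntax, so the same parsing applies to any operands.
a, b = 1, 2

# Without parentheses, & is evaluated before the comparisons, so this parses
# as the chained comparison  a < (5 & b) > 1.
without_parens = a < 5 & b > 1

# With parentheses, each comparison is evaluated first and the two results
# are then combined, which is the form used in the examples above.
with_parens = (a < 5) & (b > 1)

print(without_parens, with_parens)  # False True
```

When building a filter from several comparisons, parenthesising each comparison before combining with &, | or ^ avoids this pitfall.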
- Timestamp filtering:
pandas.Timestamp, datetime.datetime, pandas.Timedelta, and datetime.timedelta objects are supported. Note that internally all of these types are converted to nanoseconds (since epoch in the Timestamp/datetime cases). This means that nonsensical operations such as multiplying two times together are permitted (but not encouraged).
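To make the nanosecond representation concrete, here is a minimal standard-library sketch of the conversion described above. It is not ArcticDB's internal code: the helper names to_epoch_nanos and to_nanos are hypothetical, and timezone-aware UTC datetimes are used as an assumption, since the text does not say how naive datetimes are interpreted.

```python
from datetime import datetime, timedelta, timezone

# Unix epoch as a timezone-aware datetime (assumption: values are in UTC).
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
ONE_MICROSECOND = timedelta(microseconds=1)


def to_epoch_nanos(dt: datetime) -> int:
    """Nanoseconds between the Unix epoch and a timezone-aware datetime."""
    # Integer division by a one-microsecond timedelta avoids float rounding.
    return ((dt - EPOCH) // ONE_MICROSECOND) * 1000


def to_nanos(delta: timedelta) -> int:
    """Length of a timedelta in nanoseconds."""
    return (delta // ONE_MICROSECOND) * 1000


start = datetime(2023, 1, 1, tzinfo=timezone.utc)
one_day = timedelta(days=1)

print(to_epoch_nanos(start))  # 1672531200000000000
print(to_nanos(one_day))      # 86400000000000
```

Once both kinds of value are plain integers, adding a duration to a time is meaningful, while multiplying two times is merely well-defined arithmetic on large numbers.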
- Restrictions:
String equality/inequality (and isin/isnotin) is supported for printable ASCII characters only. Although not prohibited, it is not recommended to use ==, !=, isin, or isnotin with floating point values.
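Both restrictions can be illustrated with the standard library alone. The sketch below assumes that "printable ASCII" means code points 0x20 to 0x7E; is_printable_ascii is a hypothetical helper for checking values before they go into a query, not part of ArcticDB.

```python
import math


def is_printable_ascii(value: str) -> bool:
    """True if every character lies in the assumed range 0x20 to 0x7E."""
    # isascii() rejects non-ASCII characters; isprintable() rejects control
    # characters such as tab, newline and DEL.
    return value.isascii() and value.isprintable()


print(is_printable_ascii("AAPL"))   # True
print(is_printable_ascii("café"))   # False (non-ASCII character)
print(is_printable_ascii("a\tb"))   # False (control character)

# Exact equality on floats is fragile because of binary rounding.
total = 0.1 + 0.2
print(total == 0.3)              # False
print(math.isclose(total, 0.3))  # True
```

Within a query, one way to avoid exact float equality is to filter on a narrow range using the comparison operators listed earlier, for example q[(q["x"] > 0.3 - eps) & (q["x"] < 0.3 + eps)], where eps is a tolerance chosen by the caller.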
- Exceptions:
inf or -inf values are provided for comparison
Column involved in query is a Categorical
Symbol is pickled
Column involved in query is not present in symbol
Query involves comparing strings using <, <=, >, or >= operators
Query involves comparing a string to one or more numeric values, or vice versa
Query involves arithmetic with a column containing strings