Querying your database

See Creating your database on how to create a database that can be queried.

Combine queries to select specific rows from the database and utilize them in the conversion planning process.

Queries support chaining, allowing users to refine selections by applying multiple filter criteria sequentially. This chaining mechanism enables precise and flexible data selection.

from dicomselect import Database

# Initialize an existing .db
db = Database(existing_db_path)
with db as query:
    query_0000 = query.where('patient_id', '=', 'ProstateX-0000').where('image_direction', '=', 'transverse')
    print(query_0000.include('image_direction', 'rows', 'columns', 'flip_angle'))

The print gives:

Total selected DICOMs: 5
========================
image_direction
    (5) transverse
rows
    (4) 128
    (1) 384
columns
    (3) 84
    (1) 384
    (1) 128
flip_angle
    (3) 90.0
    (1) 160.0
    (1) 14.0

Query API

class dicomselect.query.Query(*args)

Bases: object

Combine queries (a selection of rows from the database) with Database.plan() to plan out a conversion of your selection.

Examples

>>> db = Database(db_path)
>>> with db as query:
>>>     query_0000 = query.where('patient_id', '=', 'ProstateX-0000').where('image_direction', '=', 'transverse')
>>> db.plan(template_str, query_0000)
property columns

Return a tuple containing the names of all the columns in the database.

Returns:

A tuple of column names.

property count: int
difference(where: Query) Query

Create a new query by taking the difference of the results of the specified queries.

Parameters:

where (Query) – The query to subtract.

Raises:

ValueError – If any of the specified queries do not exist.

Return type:

Query

distinct_values(column: str) List[str]

Retrieve distinct values from a specified column.

Parameters:

column (str) – The name of the column to retrieve distinct values from.

Return type:

List[str]

exclude(*columns, recommended: bool = True, exclude_all_distinct: bool = False, exclude_none_distinct: bool = False) Info

Create an Info object, which excludes the provided columns for printouts.

Parameters:
  • columns – Columns to filter down to.

  • recommended (bool) – Exclude columns that the author considers not useful in most cases. (most UID values, physical positions)

  • exclude_all_distinct (bool) – Exclude columns where every value is unique.

  • exclude_none_distinct (bool) – Exclude columns where all values are the same.

Return type:

Info

include(*columns) Info

Create an Info object, which filters down to the provided columns for printouts.

Parameters:

columns – Columns to filter down to.

Return type:

Info

property info

Returns an Info object which can print out the current query selection.

Info inherits the following functions: count(), to_string(), to_df(), include() and exclude().

intersect(where: Query) Query

Create a new view by intersecting the results of the specified queries.

Parameters:

where (Query) – The query to intersect. Leave empty to intersect using the last two queries.

Return type:

Query

Raises
ValueError

If any of the specified queries do not exist.

property is_base: bool

Whether this query is the base query obtained from the parent Database.

to_df() DataFrame

The current results of a query as a pandas DataFrame. This function is untested.

Return type:

DataFrame

to_string(sort_by_homogeneous: bool = True) str

The current results of a query as a str.

Parameters:

sort_by_homogeneous (bool) – Sort by homogeneous columns. (default = True)

Return type:

str

union(where: Query) Query

Create a new query by taking the union of the results of the specified queries.

Parameters:

where (Query) – The query to union. Leave empty to union using the last two queries.

Raises:

ValueError – If any of the specified queries do not exist.

Return type:

Query

where(column: str, operator: str, values: List[str] | str, invert: bool = False) Query

Filter the dataset based on the given column, operator, and values. The result can be combined with other queries using the union(), difference(), and intersect() methods.

Parameters:
  • column (str) – Name of the column to query. The name is case-sensitive. The columns property can be used to obtain a list of all available columns.

  • operator (str) – Valid operators include ‘=’, ‘<>’, ‘!=’, ‘>’, ‘>=’, ‘<’, ‘<=’, ‘like’, ‘between’, and ‘in’.

  • values (List[str] | str) – Values to query. Providing more values than expected will create OR chains, eg. (column=’a’) OR (column=’b’), where appropriate.

  • invert (bool) – Invert the query, by prefixing the query with a NOT.

Return type:

Query

where_raw(sql: str) Query

Create a query based on a raw SQL query. Not recommended.

Parameters:

sql (str) – SQL query. “… WHERE” is prefixed.

Raises:

ValueError – Invalid SQL.

Return type:

Query