gpf.lookups module

This module can be used to build lookup data structures from Esri tables and feature classes.

lookups._process_row(row: Sequence, **kwargs) → Optional[str]

The default row processor function used by the Lookup class. Alternative row processor functions are implemented by the other lookup classes (e.g. ValueLookup).

Parameters:
  • lookup – A reference to the lookup dictionary. If the process_row() function is built in to a lookup class, lookup refers to self.
  • row – The current row tuple (as returned by a SearchCursor).
  • kwargs – Optional user-defined keyword arguments.
Return type:

None, str, unicode

Note

This “private” function is documented here so that users can see its signature and behaviour. Users should not call it directly, but should define their own row processor functions with the same signature.

Row processor functions directly manipulate (i.e. populate) the dictionary. Typically, this function should at least add a key and value(s) to the lookup dictionary.

A row processor function should always return None, unless the lookup must be terminated. In that case, it should return a failure reason (message).
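The behaviour described above can be sketched with a plain dict, so that it runs without ArcPy. This is only an illustration: the names `process_row_unique` and `populate` are hypothetical, and the `(lookup, row, **kwargs)` argument order is an assumption taken from the parameter list above, not from the library's source.

```python
def process_row_unique(lookup, row, **kwargs):
    """Hypothetical row processor: store row[1] under row[0], stop on a duplicate key."""
    key, value = row[0], row[1]
    if key is None:
        # Skip rows without a key and carry on with the next row.
        return None
    if key in lookup:
        # Returning a message signals that the lookup should be terminated.
        return 'Duplicate key encountered: {!r}'.format(key)
    lookup[key] = value
    return None


def populate(lookup, rows, row_func, **kwargs):
    """Hypothetical driver loop that stands in for the cursor iteration."""
    for row in rows:
        failure = row_func(lookup, row, **kwargs)
        if failure:
            return failure
    return None


demo = {}
message = populate(demo, [('a', 1), (None, 2), ('b', 3), ('a', 4)], process_row_unique)
```

Here `demo` keeps the rows processed before the duplicate was hit, and `message` holds the failure reason.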

gpf.lookups.XYZ_RESOLUTION = 0.0001

The default (Esri-recommended) resolution that is used by the get_nodekey() function (i.e. for lookups). If coordinate values fall within this distance, they are considered equal. Set this to a higher or lower value (coordinate system units) if required.

gpf.lookups.get_nodekey(*args) → Tuple[int]

This function creates a hash-like tuple that can be used as a key in a RowLookup or ValueLookup dictionary. The tuple does not contain actual hashes, but consists of 2 or 3 (long) integers, which are created by dividing each coordinate value by the resolution (0.0001 by default) and truncating the result to an integer.

Whenever a lookup is created using SHAPE@XY or SHAPE@XYZ as the key_field, this function is automatically used to generate a key for the coordinate. If the user has a coordinate and wants to find the matching value(s) in the lookup, the coordinate must be turned into a key first using this function.

Note

The number of dimensions of the coordinate must match the ones in the lookup. In other words, when a lookup was built using 2D coordinates, the lookup key must be 2D as well.

Warning

This function has been tested on 10 million random points and no duplicate keys were encountered. However, bear in mind that two nearly identical coordinates might share the same key if they lie within the default resolution distance of each other (0.0001 units, e.g. meters). If the default resolution needs to be changed, set the XYZ_RESOLUTION constant beforehand.

Example:

>>> coord_lookup = ValueLookup('C:/Temp/test.gdb/my_points', 'SHAPE@XY', 'GlobalID')
>>> coord = (4.2452, 23.24541)
>>> key = get_nodekey(*coord)
>>> print(key)
(42451, 232454)
>>> coord_lookup.get(key)
'{628ee94d-2063-47be-b57f-8c2af6345d4e}'
Parameters:
  • args – A minimum of 2 numeric values, an EsriJSON dictionary, or an ArcPy Point or PointGeometry instance.
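The divide-and-truncate idea behind the node key can be sketched as follows. This is not the library's implementation: `sketch_nodekey` is a hypothetical stand-in that only accepts plain numeric values, and it takes the resolution as a keyword argument for convenience.

```python
XYZ_RESOLUTION = 0.0001  # same default value as documented above


def sketch_nodekey(*coords, resolution=XYZ_RESOLUTION):
    """Hypothetical stand-in: divide each value by the resolution and truncate."""
    if len(coords) not in (2, 3):
        raise ValueError('Expected 2 (X, Y) or 3 (X, Y, Z) numeric values')
    return tuple(int(value / resolution) for value in coords)


key_a = sketch_nodekey(4.24525, 23.24541)
# This coordinate differs by 0.00002 units per axis and lands in the same cell.
key_b = sketch_nodekey(4.24527, 23.24543)
# A 3D coordinate yields a 3-tuple, which never equals a 2D key.
key_3d = sketch_nodekey(4.24525, 23.24541, 10.00005)
```

Both 2D coordinates produce the same key, which illustrates the warning above. It also shows why the dimensions of the key must match those used to build the lookup.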
gpf.lookups.get_coordtuple(node_key: Tuple[int]) → Tuple[float]

This function converts a node key (a tuple of integers created by get_nodekey()) back into a floating point X, Y(, Z) coordinate tuple.

Warning

This function should only be used to generate output for printing/logging purposes or to create approximate coordinates. Because get_nodekey() truncates the coordinate, it is impossible to get the same coordinate value back as the one that was used to create the node key, which means that some accuracy will be lost in the process.

Parameters:
  • node_key – The node key (tuple of integers) that has to be converted.
Return type:

tuple
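The accuracy loss mentioned in the warning can be made visible with a small round trip. Both helper functions below are hypothetical stand-ins, and the sketch assumes that the inverse simply multiplies each integer by the resolution.

```python
RESOLUTION = 0.0001  # assumed to be equal to XYZ_RESOLUTION


def sketch_nodekey(*coords):
    """Hypothetical forward step: divide by the resolution and truncate."""
    return tuple(int(value / RESOLUTION) for value in coords)


def sketch_coordtuple(node_key):
    """Hypothetical inverse step: multiply each integer by the resolution."""
    return tuple(value * RESOLUTION for value in node_key)


original = (4.24525, 23.24541)
restored = sketch_coordtuple(sketch_nodekey(*original))
# Absolute difference per axis between the original and restored coordinate.
errors = tuple(abs(o - r) for o, r in zip(original, restored))
```

The restored coordinate differs from the original, but under these assumptions the error per axis stays below the resolution.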
class gpf.lookups.Lookup(table_path, key_field, value_fields, {where_clause}, {**kwargs})

Base class for all lookups.

This class can be instantiated directly, but typically, a user would create a custom lookup class based on this one and then override the Lookup._process_row() method. Please refer to other implementations (RowLookup, ValueLookup) for concrete examples.

Params:

  • table_path (str, unicode):

    Full source table or feature class path.

  • key_field (str, unicode):

    The field to use for the lookup dictionary keys. If SHAPE@X[Y[Z]] is used as the key field, the coordinates are “hashed” using the gpf.lookups.get_nodekey() function. This means that the user should use this function as well to create a coordinate key before looking up the matching value for it.

  • value_fields (list, tuple, str, unicode):

    The field or fields to include as the lookup dictionary value(s), i.e. row. This is the value (or tuple of values) that is returned when you perform a lookup by key.

  • where_clause (str, unicode, gpf.tools.queries.Where):

    An optional where clause to filter on.

Keyword params:

  • row_func:

    If the user wishes to use the standard Lookup class with a custom row processor function, that function can be passed in using the keyword row_func.

Raises:
  • RuntimeError – When the lookup cannot be created or populated.
  • ValueError – When a specified lookup field does not exist in the source table, or when multiple value fields were specified.
clear() → None. Remove all items from D.
copy() → a shallow copy of D
fromkeys()

Returns a new dict with keys from iterable and values equal to value.

get(k[, d]) → D[k] if k in D, else d. d defaults to None.
items() → a set-like object providing a view on D's items
keys() → a set-like object providing a view on D's keys
pop(k[, d]) → v, remove specified key and return the corresponding value.

If key is not found, d is returned if given, otherwise KeyError is raised

popitem() → (k, v), remove and return some (key, value) pair as a 2-tuple; but raise KeyError if D is empty.

setdefault(k[, d]) → D.get(k,d), also set D[k]=d if k not in D
update([E, ]**F) → None. Update D from dict/iterable E and F.

If E is present and has a .keys() method, then does: for k in E: D[k] = E[k] If E is present and lacks a .keys() method, then does: for k, v in E: D[k] = v In either case, this is followed by: for k in F: D[k] = F[k]

values() → an object providing a view on D's values
class gpf.lookups.ValueLookup(table_path, key_field, value_field, {where_clause}, {duplicate_keys})

Creates a lookup dictionary from a given source table or feature class. ValueLookup inherits from dict, so all the built-in dictionary functions (update(), items() etc.) are available.

When an empty key (None) is encountered, the key-value pair will be discarded.

Params:

  • table_path (str, unicode):

    Full source table or feature class path.

  • key_field (str, unicode):

    The field to use for the ValueLookup dictionary keys. If SHAPE@X[Y[Z]] is used as the key field, the coordinates are “hashed” using the gpf.lookups.get_nodekey() function. This means that the user should use this function as well to create a coordinate key before looking up the matching value for it.

  • value_field (str, unicode):

    The single field to include in the ValueLookup dictionary value. This is the value that is returned when you perform a lookup by key.

  • where_clause (str, unicode, gpf.tools.queries.Where):

    An optional where clause to filter the table.

Keyword params:

  • duplicate_keys (bool):

    If True, the ValueLookup allows for duplicate keys in the input. The dictionary values will become lists of values instead of a single value. Please note that actual duplicate checks will not be performed. This means that when duplicate_keys is False and duplicates are encountered, the existing key-value pair will be overwritten by the last one encountered.

Raises:
  • RuntimeError – When the lookup cannot be created or populated.
  • ValueError – When a specified lookup field does not exist in the source table, or when multiple value fields were specified.

See also

When multiple fields should be stored in the lookup, the gpf.lookups.RowLookup class should be used instead.
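The effect of duplicate_keys and the handling of empty keys can be sketched with a plain dict. The function `build_value_lookup` is hypothetical and takes (key, value) tuples instead of reading a table, so it only mirrors the documented behaviour and is not the library's code.

```python
def build_value_lookup(rows, duplicate_keys=False):
    """Hypothetical stand-in for ValueLookup population, fed with (key, value) tuples."""
    lookup = {}
    for key, value in rows:
        if key is None:
            # Empty keys are discarded.
            continue
        if duplicate_keys:
            # Collect all values for a key in a list.
            lookup.setdefault(key, []).append(value)
        else:
            # No duplicate check: a later row silently replaces an earlier one.
            lookup[key] = value
    return lookup


rows = [('A', 1), ('B', 2), (None, 3), ('A', 4)]
single = build_value_lookup(rows)
multi = build_value_lookup(rows, duplicate_keys=True)
```

With duplicate_keys left at False, key 'A' ends up with the value of the last row. With duplicate_keys set to True, every value becomes a list, even for keys that occur only once.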

clear() → None. Remove all items from D.
copy() → a shallow copy of D
fromkeys()

Returns a new dict with keys from iterable and values equal to value.

get(k[, d]) → D[k] if k in D, else d. d defaults to None.
items() → a set-like object providing a view on D's items
keys() → a set-like object providing a view on D's keys
pop(k[, d]) → v, remove specified key and return the corresponding value.

If key is not found, d is returned if given, otherwise KeyError is raised

popitem() → (k, v), remove and return some (key, value) pair as a 2-tuple; but raise KeyError if D is empty.

setdefault(k[, d]) → D.get(k,d), also set D[k]=d if k not in D
update([E, ]**F) → None. Update D from dict/iterable E and F.

If E is present and has a .keys() method, then does: for k in E: D[k] = E[k] If E is present and lacks a .keys() method, then does: for k, v in E: D[k] = v In either case, this is followed by: for k in F: D[k] = F[k]

values() → an object providing a view on D's values
class gpf.lookups.RowLookup(table_path, key_field, value_fields, {where_clause}, {duplicate_keys}, {mutable_values})

Creates a lookup dictionary from a given table or feature class. RowLookup inherits from dict, so all the built-in dictionary functions (update(), items() etc.) are available.

When an empty key (None) is encountered, the key-values pair will be discarded.

Params:

  • table_path (str, unicode):

    Full source table or feature class path.

  • key_field (str, unicode):

    The field to use for the RowLookup dictionary keys. If SHAPE@X[Y[Z]] is used as the key field, the coordinates are “hashed” using the gpf.lookups.get_nodekey() function. This means that the user should use this function as well to create a coordinate key before looking up the matching values for it.

  • value_fields (list, tuple):

    The fields to include in the RowLookup dictionary values. These are the values that are returned when you perform a lookup by key.

  • where_clause (str, unicode, gpf.tools.queries.Where):

    An optional where clause to filter the table.

Keyword params:

  • duplicate_keys (bool):

    If True, the RowLookup allows for duplicate keys in the input. The values will become lists of tuples/lists instead of a single tuple/list. Please note that duplicate checks will not actually be performed. This means that when duplicate_keys is False and duplicates are encountered, the existing key-value pair will simply be overwritten by the last one encountered.

  • mutable_values (bool):

    If True, the RowLookup values are stored as list objects. These are mutable, which means that you can change the values or add new ones. The default is False, which causes the RowLookup values to become tuple objects. Tuples are immutable, consume less memory and allow for faster retrieval.

Raises:
  • RuntimeError – When the lookup cannot be created or populated.
  • ValueError – When a specified lookup field does not exist in the source table, or when a single value field was specified.

See also

When a single field value should be stored in the lookup, the gpf.lookups.ValueLookup class should be used instead.
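The difference between the two mutable_values settings can be illustrated with plain dictionaries. The row data below is made up, and the dict comprehensions merely imitate how the values would be stored.

```python
rows = [('id-1', 'north', 10), ('id-2', 'south', 20)]

# mutable_values=False (the documented default): values are stored as tuples.
immutable_lookup = {row[0]: tuple(row[1:]) for row in rows}
# mutable_values=True: values are stored as lists, which can be edited in place.
mutable_lookup = {row[0]: list(row[1:]) for row in rows}

mutable_lookup['id-1'][1] = 99

try:
    immutable_lookup['id-1'][1] = 99
    tuple_edit_failed = False
except TypeError:
    # Tuples do not support item assignment.
    tuple_edit_failed = True
```

Only the list-based lookup accepts the in-place change. The tuple-based one raises a TypeError and keeps its original row.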

get_value(key, field, default=None)

Looks up a value by key for one specific field. This function can be convenient when only a single value needs to be retrieved from the lookup. Unlike the built-in get() method, which returns a list or tuple of values (i.e. the row), get_value() returns a single value.

Example:

>>> my_lookup = RowLookup('C:/Temp/test.gdb/my_table', 'GlobalID', ('Field1', 'Field2'))
>>> # Traditional approach to print Field1:
>>> values = my_lookup.get('{628ee94d-2063-47be-b57f-8c2af6345d4e}')
>>> if values:
...     print(values[0])
ThisIsTheValueOfField1
>>> # Alternative traditional approach to print Field1:
>>> field1, field2 = my_lookup.get('{628ee94d-2063-47be-b57f-8c2af6345d4e}', (None, None))
>>> if field1:
...     print(field1)
ThisIsTheValueOfField1
>>> # Approach using the get_value() function:
>>> print(my_lookup.get_value('{628ee94d-2063-47be-b57f-8c2af6345d4e}', 'Field1'))
ThisIsTheValueOfField1
Parameters:
  • key – Key to find in the lookup dictionary.
  • field – The field name (as used during initialization of the lookup) for which to retrieve the value.
  • default – The value to return when the value was not found. Defaults to None.
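One way a field-based getter like this could work is by remembering the position of each value field. The class `SketchRowLookup` below is a hypothetical dict subclass fed with in-memory rows. It is an assumption about the mechanism and not the library's implementation.

```python
class SketchRowLookup(dict):
    """Hypothetical dict subclass that remembers the order of its value fields."""

    def __init__(self, value_fields, rows):
        super().__init__()
        # Map each field name to its position inside the stored row tuple.
        self._field_index = {name: i for i, name in enumerate(value_fields)}
        for row in rows:
            # The first item is the key, the remaining items form the row.
            self[row[0]] = tuple(row[1:])

    def get_value(self, key, field, default=None):
        row = self.get(key)
        if row is None:
            return default
        return row[self._field_index[field]]


lookup = SketchRowLookup(('Field1', 'Field2'), [('k1', 'a', 'b'), ('k2', 'c', 'd')])
```

In this sketch, an unknown field name raises a KeyError. How the real class treats that case is not stated in this documentation.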
clear() → None. Remove all items from D.
copy() → a shallow copy of D
fromkeys()

Returns a new dict with keys from iterable and values equal to value.

get(k[, d]) → D[k] if k in D, else d. d defaults to None.
items() → a set-like object providing a view on D's items
keys() → a set-like object providing a view on D's keys
pop(k[, d]) → v, remove specified key and return the corresponding value.

If key is not found, d is returned if given, otherwise KeyError is raised

popitem() → (k, v), remove and return some (key, value) pair as a 2-tuple; but raise KeyError if D is empty.

setdefault(k[, d]) → D.get(k,d), also set D[k]=d if k not in D
update([E, ]**F) → None. Update D from dict/iterable E and F.

If E is present and has a .keys() method, then does: for k in E: D[k] = E[k] If E is present and lacks a .keys() method, then does: for k, v in E: D[k] = v In either case, this is followed by: for k in F: D[k] = F[k]

values() → an object providing a view on D's values
class gpf.lookups.NodeSet(fc_path: str, where_clause: Union[None, str, gpf.tools.queries.Where] = None, all_vertices: bool = False)

Builds a set of unique node keys for coordinates in a feature class. The get_nodekey() function will be used to generate the coordinate hash. When the feature class is Z aware, the node keys will be 3D as well. Note that in all cases, M will be ignored.

The NodeSet inherits all methods from the built-in Python set.

For feature classes with a geometry type other than Point, a NodeSet will be built from the first and last points in a geometry. If this is not desired (i.e. all coordinates should be included), the user should set the all_vertices option to True. An exception to this behavior is the Multipoint geometry: for this type, all coordinates will always be included.

Params:

  • fc_path (str):

    The full path to the feature class.

  • where_clause (str, unicode, gpf.tools.queries.Where):

    An optional where clause to filter the feature class.

  • all_vertices (bool):

    Defaults to False. When set to True, all geometry coordinates are included. Otherwise, only the first and/or last points are considered.

Raises:
  • ValueError – If the input dataset is not a feature class or if the geometry type is MultiPatch.
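The effect of all_vertices can be sketched for line geometries given as lists of (x, y) tuples. The function `build_node_set` is hypothetical: it does not read a feature class, it ignores the special handling of Point and Multipoint geometries, and it assumes the divide-and-truncate key described for get_nodekey().

```python
def build_node_set(geometries, all_vertices=False, resolution=0.0001):
    """Hypothetical stand-in for NodeSet, fed with lists of (x, y) vertex tuples."""
    nodes = set()
    for vertices in geometries:
        if not vertices:
            continue
        # Use either every vertex, or only the first and last one.
        selected = vertices if all_vertices else (vertices[0], vertices[-1])
        for coord in selected:
            nodes.add(tuple(int(value / resolution) for value in coord))
    return nodes


lines = [
    [(0.00015, 0.00015), (1.00015, 0.00015), (2.00015, 0.00015)],
    [(2.00015, 0.00015), (2.00015, 3.00015)],
]
end_nodes = build_node_set(lines)
all_nodes = build_node_set(lines, all_vertices=True)
```

The endpoint shared by both lines is stored only once, and the intermediate vertex of the first line only appears when all_vertices is True.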
add()

Add an element to a set.

This has no effect if the element is already present.

clear()

Remove all elements from this set.

copy()

Return a shallow copy of a set.

difference()

Return the difference of two or more sets as a new set.

(i.e. all elements that are in this set but not the others.)

difference_update()

Remove all elements of another set from this set.

discard()

Remove an element from a set if it is a member.

If the element is not a member, do nothing.

intersection()

Return the intersection of two sets as a new set.

(i.e. all elements that are in both sets.)

intersection_update()

Update a set with the intersection of itself and another.

isdisjoint()

Return True if two sets have a null intersection.

issubset()

Report whether another set contains this set.

issuperset()

Report whether this set contains another set.

pop()

Remove and return an arbitrary set element. Raises KeyError if the set is empty.

remove()

Remove an element from a set; it must be a member.

If the element is not a member, raise a KeyError.

symmetric_difference()

Return the symmetric difference of two sets as a new set.

(i.e. all elements that are in exactly one of the sets.)

symmetric_difference_update()

Update a set with the symmetric difference of itself and another.

union()

Return the union of sets as a new set.

(i.e. all elements that are in either set.)

update()

Update a set with the union of itself and others.

class gpf.lookups.ValueSet(table_path, field, where_clause=None)

Builds a set of unique values for a single column in a feature class or table. This class inherits all methods from the built-in Python frozenset.

Params:

  • table_path (str):

    The full path to the table or feature class.

  • field (str):

    The field name for which to collect a set of unique values.

  • where_clause (str, gpf.tools.queries.Where):

    An optional where clause to filter the table or feature class.
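Because the class is based on frozenset, the collected values cannot be changed afterwards. A minimal sketch with made-up row tuples follows. The function `build_value_set` is hypothetical and keeps every value it sees, which may differ from how the real class treats empty values.

```python
def build_value_set(rows, field_index=0):
    """Hypothetical stand-in for ValueSet, fed with row tuples instead of a cursor."""
    return frozenset(row[field_index] for row in rows)


rows = [('oak',), ('pine',), ('oak',), ('birch',)]
species = build_value_set(rows)
# A frozenset has no add() or remove() methods, so this is False.
can_add = hasattr(species, 'add')
```

Membership tests and set operations such as union() still work, and each of them returns a new set.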

copy()

Return a shallow copy of a set.

difference()

Return the difference of two or more sets as a new set.

(i.e. all elements that are in this set but not the others.)

intersection()

Return the intersection of two sets as a new set.

(i.e. all elements that are in both sets.)

isdisjoint()

Return True if two sets have a null intersection.

issubset()

Report whether another set contains this set.

issuperset()

Report whether this set contains another set.

symmetric_difference()

Return the symmetric difference of two sets as a new set.

(i.e. all elements that are in exactly one of the sets.)

union()

Return the union of sets as a new set.

(i.e. all elements that are in either set.)