Data Types - An Introduction to agtype

AGE uses a custom data type called agtype, which is the only data type returned by AGE. Agtype is a superset of JSON and a custom implementation of JSONB.

Simple Data Types

Null

In Cypher, null is used to represent missing or undefined values. Conceptually, null means ‘a missing unknown value’, and it is treated somewhat differently from other values. For example, getting a property from a vertex that does not have that property produces null. Most expressions that take null as input will produce null. This includes boolean expressions that are used as predicates in the WHERE clause; there, anything that is not true is interpreted as being false. null is not equal to null: not knowing two values does not imply that they are the same value. So the expression null = null yields null, not true.
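As a minimal sketch of these rules, the two queries below compare null with itself and then test for it with IS NULL. They follow the query form used throughout this section and assume a graph named graph_name already exists; the column alias result is illustrative.

```sql
-- null = null evaluates to null (unknown), not true.
SELECT *
FROM cypher('graph_name', $$
    RETURN null = null
$$) AS (result agtype);

-- To test for a missing value, use IS NULL rather than =.
SELECT *
FROM cypher('graph_name', $$
    RETURN null IS NULL
$$) AS (result agtype);
```

Per the rules above, the first query is expected to return null and the second true.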

Input/Output Format

Query

SELECT *
FROM cypher('graph_name', $$
    RETURN NULL
$$) AS (null_result agtype);

A null will appear as an empty space.

Result:

null_result

(1 row)

Agtype NULL vs Postgres NULL

The concept of NULL in Agtype and Postgres is the same as it is in Cypher.

Integer

The integer type stores whole numbers, i.e. numbers without fractional components. The integer data type is a 64-bit field that stores values from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807. Attempts to store values outside this range will result in an error.

Agtype has a single integer type. Its 64-bit range matches that of the PostgreSQL bigint type, so there are no separate smallint or bigint variants to choose between.

Input/Output Format

Query

SELECT *
FROM cypher('graph_name', $$
    RETURN 1
$$) AS (int_result agtype);

Result:

int_result
1
(1 row)

Float

The data type float is an inexact, variable-precision numeric type, conforming to the IEEE-754 Standard.

Inexact means that some values cannot be converted exactly to the internal format and are stored as approximations, so that storing and retrieving a value might show slight discrepancies. Managing these errors and how they propagate through calculations is the subject of an entire branch of mathematics and computer science and will not be discussed here, except for the following points:

  • If you require exact storage and calculations (such as for monetary amounts), use the numeric type instead.

  • If you want to do complicated calculations with these types for anything important, especially if you rely on certain behavior in boundary cases (infinity, underflow), you should evaluate the implementation carefully.

  • Comparing two floating-point values for equality might not always work as expected.
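The last point can be illustrated with a short query, sketched here on the assumption that agtype floats behave as IEEE 754 double-precision values. None of 0.1, 0.2, or 0.3 has an exact binary representation, so the sum on the left is not bit-for-bit equal to the literal on the right. The graph name and column alias are illustrative.

```sql
-- Compare a computed float with a float literal.
SELECT *
FROM cypher('graph_name', $$
    RETURN 0.1 + 0.2 = 0.3
$$) AS (floats_equal agtype);
```

Under that assumption the comparison is expected to yield false, which is why the numeric type is the better choice when exact results matter.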

Values that are too large or too small will cause an error. Rounding might take place if the precision of an input number is too high. Numbers too close to zero that are not representable as distinct from zero will cause an underflow error.

In addition to ordinary numeric values, the floating-point types have several special values:

  • Infinity

  • -Infinity

  • NaN

These represent the IEEE 754 special values “infinity”, “negative infinity”, and “not-a-number”, respectively. When writing these values as constants in a Cypher command, you must put quotes around them and typecast them, for example:

SET x.float_value = '-Infinity'::float

On input, these strings are recognized in a case-insensitive manner.

Note: IEEE 754 specifies that NaN should not compare equal to any other floating-point value (including NaN). However, in order to allow floats to be sorted correctly, AGE evaluates 'NaN'::float = 'NaN'::float to true. See the section Comparability and Equality for more details.
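The equality rule in the note can be checked with a query of the following shape. It is a sketch that reuses the quoting and typecast form shown earlier for special float values; the graph name and column alias are illustrative.

```sql
-- Compare NaN with itself; per the note, AGE treats the two as equal.
SELECT *
FROM cypher('graph_name', $$
    RETURN 'NaN'::float = 'NaN'::float
$$) AS (nan_equal agtype);
```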

Input/Output Format

To use a float, write a value with a decimal point.

Query

SELECT *
FROM cypher('graph_name', $$
    RETURN 1.0
$$) AS (float_result agtype);

Result:

float_result
1.0
(1 row)

Numeric

The type numeric can store numbers with a very large number of digits. It is especially recommended for storing monetary amounts and other quantities where exactness is required. Calculations with numeric values yield exact results where possible, e.g., addition, subtraction, multiplication. However, calculations on numeric values are very slow compared to the integer types, or to the floating-point type.

We use the following terms below: The precision of a numeric is the total count of significant digits in the whole number, that is, the number of digits to both sides of the decimal point. The scale of a numeric is the count of decimal digits in the fractional part, to the right of the decimal point. So the number 23.5141 has a precision of 6 and a scale of 4. Integers can be considered to have a scale of zero.

Specifying numeric without any precision or scale creates a column in which numeric values of any precision and scale can be stored, up to the implementation limit on precision. A column of this kind will not coerce input values to any particular scale, whereas numeric columns with a declared scale will coerce input values to that scale. (The SQL standard requires a default scale of 0, i.e., coercion to integer precision. We find this a bit useless. If you’re concerned about portability, always specify the precision and scale explicitly.)

Note: The maximum allowed precision when explicitly specified in the type declaration is 1000; NUMERIC without a specified precision is subject to the limits described in Table 8.2 of the PostgreSQL documentation.

If the scale of a value to be stored is greater than the declared scale of the column, the system will round the value to the specified number of fractional digits. Then, if the number of digits to the left of the decimal point exceeds the declared precision minus the declared scale, an error is raised.

Numeric values are physically stored without any extra leading or trailing zeroes. Thus, the declared precision and scale of a column are maximums, not fixed allocations. (In this sense the numeric type is more akin to varchar(n) than to char(n).) The actual storage requirement is two bytes for each group of four decimal digits, plus three to eight bytes overhead.

In addition to ordinary numeric values, the numeric type allows the special value NaN, meaning “not-a-number”. Any operation on NaN yields another NaN. When writing this value as a constant in an SQL command, you must put quotes around it, for example UPDATE table SET x = 'NaN'.

Note: In most implementations of the “not-a-number” concept, NaN is considered not equal to any other numeric value (including NaN). However, in order to allow numeric values to be sorted correctly, AGE evaluates 'NaN'::numeric = 'NaN'::numeric to true. See the section Comparability and Equality for more details.

When rounding values, the numeric type rounds ties away from zero, while (on most machines) the real and double precision types round ties to the nearest even number.
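The difference can be seen in plain PostgreSQL, outside of Cypher. The sketch below rounds the same tie values once as numeric and once as double precision. It assumes the AGE numeric and float types follow the behavior of the corresponding PostgreSQL types, as the text above describes; the column aliases are illustrative.

```sql
-- Round each tie value (-3.5, -2.5, ..., 3.5) under both types.
SELECT x,
       round(x::numeric)          AS num_round,
       round(x::double precision) AS dbl_round
FROM generate_series(-3.5, 3.5, 1) AS x;
```

For the input 2.5, for instance, the numeric column is expected to show 3, while the double precision column shows 2 on machines that round ties to even.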

Input/Output Format

To create a numeric value, the ::numeric type annotation is required.

Query

SELECT *
FROM cypher('graph_name', $$
    RETURN 1.0::numeric
$$) AS (numeric_result agtype);

Result:

numeric_result
1.0::numeric
(1 row)

Bool

AGE provides the standard Cypher type boolean. The boolean type can have several states: “true”, “false”, and a third state, “unknown”, which is represented by the Agtype null value.

Boolean constants can be represented in Cypher queries by the keywords TRUE, FALSE, and NULL.
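The following sketch shows how the third state propagates through AND and OR. It assumes AGE follows the usual three-valued logic; the graph name and column aliases are illustrative.

```sql
-- Combine known boolean values with an unknown (null) operand.
SELECT *
FROM cypher('graph_name', $$
    RETURN true AND null, false AND null, true OR null
$$) AS (true_and_null agtype, false_and_null agtype, true_or_null agtype);
```

Under that assumption the expected results are null, false, and true: a null operand leaves the result unknown only when the other operand does not already decide it.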

Input/Output Format

Query

SELECT *
FROM cypher('graph_name', $$
    RETURN TRUE
$$) AS (boolean_result agtype);

Unlike Postgres, AGE’s boolean outputs as the full word, i.e. true and false, as opposed to t and f.

Result:

boolean_result
true
(1 row)

String

Agtype string literals can contain the following escape sequences:

Escape Sequence    Character
\t                 Tab
\b                 Backspace
\n                 Newline
\r                 Carriage Return
\f                 Form Feed
\'                 Single Quote
\"                 Double Quote
\\                 Backslash
\uXXXX             Unicode UTF-16 code point (4 hex digits must follow the \u)
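As a small sketch of these escape sequences in use, the query below embeds an escaped single quote and a tab in one string literal. Because the Cypher text sits inside a dollar-quoted block, the backslashes reach the Cypher parser unchanged. The graph name and column alias are illustrative.

```sql
-- \' keeps the quote from ending the literal; \t inserts a tab.
SELECT *
FROM cypher('graph_name', $$
    RETURN 'It\'s a tab:\there'
$$) AS (escaped_string agtype);
```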

Input/Output Format

Use single quotes (') to delimit a string. The output will use double quotes (").

Query

SELECT *
FROM cypher('graph_name', $$
    RETURN 'This is a string'
$$) AS (string_result agtype);

Result:

string_result
"This is a string"
(1 row)

Composite Data Types

List

All examples will use the WITH clause and RETURN clause.

Lists in general

A literal list is created by using brackets and separating the elements in the list with commas.

Query

SELECT *
FROM cypher('graph_name', $$
    WITH [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10] as lst
    RETURN lst
$$) AS (lst agtype);

Result:

lst
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
(1 row)

NULL in a List

A list can hold the value null. Unlike a standalone null, which is displayed as an empty space, a null inside a list is displayed as the word ‘null’.

Query

SELECT *
FROM cypher('graph_name', $$
    WITH [null] as lst
    RETURN lst
$$) AS (lst agtype);

Result:

lst
[null]
(1 row)

Access Individual Elements

To access individual elements in the list, use square brackets again, with a single zero-based index inside them. (With a range, described under Index Ranges, the brackets instead extract from the start index up to, but not including, the end index.)

Query

SELECT *
FROM cypher('graph_name', $$
    WITH [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10] as lst
    RETURN lst[3]
$$) AS (element agtype);

Result:

element
3
(1 row)

Map Elements in Lists

Query

SELECT *
FROM cypher('graph_name', $$
   WITH [0, {key: 'key_value'}, 2, 3, 4, 5, 6, 7, 8, 9, 10] as lst
    RETURN lst
$$) AS (map_value agtype);

Result:

map_value
[0, {"key": "key_value"}, 2, 3, 4, 5, 6, 7, 8, 9, 10]
(1 row)

Accessing Map Elements in Lists

Query

SELECT *
FROM cypher('graph_name', $$
   WITH [0, {key: 'key_value'}, 2, 3, 4, 5, 6, 7, 8, 9, 10] as lst
    RETURN lst[1].key
$$) AS (map_value agtype);

Result:

map_value
"key_value"
(1 row)

Negative Index Access

You can also use negative numbers to count from the end of the list instead.

Query

SELECT *
FROM cypher('graph_name', $$
    WITH [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10] as lst
    RETURN lst[-3]
$$) AS (element agtype);

Result:

element
8
(1 row)

Index Ranges

Finally, you can use ranges inside the brackets to return ranges of the list.

Query

SELECT *
FROM cypher('graph_name', $$
    WITH [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10] as lst
    RETURN lst[0..3]
$$) AS (element agtype);

Result:

element
[0, 1, 2]
(1 row)

Negative Index Ranges

Query

SELECT *
FROM cypher('graph_name', $$
    WITH [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10] as lst
    RETURN lst[0..-5]
$$) AS (lst agtype);

Result:

lst
[0, 1, 2, 3, 4, 5]
(1 row)

Positive Slices

Query

SELECT *
FROM cypher('graph_name', $$
    WITH [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10] as lst
    RETURN lst[..4]
$$) AS (lst agtype);

Result:

lst
[0, 1, 2, 3]
(1 row)

Negative Slices

Query

SELECT *
FROM cypher('graph_name', $$
    WITH [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10] as lst
    RETURN lst[-5..]
$$) AS (lst agtype);

Result:

lst
[6, 7, 8, 9, 10]
(1 row)

Out-of-bound slices are simply truncated, but out-of-bound single elements return null.

Query

SELECT *
FROM cypher('graph_name', $$
    WITH [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10] as lst
    RETURN lst[15]
$$) AS (element agtype);

Result:

element

(1 row)

Query

SELECT *
FROM cypher('graph_name', $$
    WITH [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10] as lst
    RETURN lst[5..15]
$$) AS (element agtype);

Result:

element
[5, 6, 7, 8, 9, 10]
(1 row)

Map

Maps can be constructed using Cypher.

Literal Maps with Simple Data Types

You can construct a simple map from simple agtype values.

Query

SELECT *
FROM cypher('graph_name', $$
    WITH {int_key: 1, float_key: 1.0, numeric_key: 1::numeric, bool_key: true, string_key: 'Value'} as m
    RETURN m
$$) AS (m agtype);

Result:

m
{"int_key": 1, "bool_key": true, "float_key": 1.0, "string_key": "Value", "numeric_key": 1::numeric}
(1 row)

Literal Maps with Composite Data Types

A map can also contain Composite Data Types, i.e. lists and other maps.

Query

SELECT *
FROM cypher('graph_name', $$
    WITH {listKey: [{inner: 'Map1'}, {inner: 'Map2'}], mapKey: {i: 0}} as m
    RETURN m
$$) AS (m agtype);

Result:

m
{"mapKey": {"i": 0}, "listKey": [{"inner": "Map1"}, {"inner": "Map2"}]}
(1 row)

Property Access of a map

Query

SELECT *
FROM cypher('graph_name', $$
    WITH {int_key: 1, float_key: 1.0, numeric_key: 1::numeric, bool_key: true, string_key: 'Value'} as m
    RETURN m.int_key
$$) AS (int_key agtype);

Result:

int_key
1
(1 row)

Accessing List Elements in Maps

Query

SELECT *
FROM cypher('graph_name', $$
    WITH {listKey: [{inner: 'Map1'}, {inner: 'Map2'}], mapKey: {i: 0}} as m
    RETURN m.listKey[0]
$$) AS (m agtype);

Result:

m
{"inner": "Map1"}
(1 row)

Simple Entities

An entity has a unique, comparable identity which defines whether or not two entities are equal.

An entity is assigned a set of properties, each of which is uniquely identified in the set by its respective property key.

GraphId

Simple entities are assigned a unique graphid. A graphid is a unique composition of the entity’s label id and a unique sequence assigned to each label. Note that there will be overlap in ids when comparing entities from different graphs.
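To inspect graphids directly, the id() function returns the graphid of an entity. The second statement below is plain SQL and rests on an assumption that is not stated in this section: that a graphid packs the label id into the upper 16 bits and the per-label sequence value into the lower 48 bits of a 64-bit integer. The literal 844424930131969 is only an illustrative value, and the graph name and column aliases are likewise illustrative.

```sql
-- List the graphid of every vertex in the graph.
SELECT *
FROM cypher('graph_name', $$
    MATCH (v)
    RETURN id(v)
$$) AS (vertex_id agtype);

-- Split an example graphid into its two parts (assumed 16/48-bit layout).
SELECT 844424930131969::bigint >> 48                      AS label_id,
       844424930131969::bigint & ((1::bigint << 48) - 1)  AS entry_id;
```

If the assumed layout holds, the second statement returns a label_id of 3 and an entry_id of 1.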

Labels

A label is an identifier that classifies vertices and edges into certain categories.

  • Edges are required to have a label, but vertices are not.

  • A vertex label and an edge label cannot share the same name.

See CREATE clause for information about how to make entities with labels.
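For a quick illustration, the sketch below creates two labeled vertices joined by one labeled edge. The labels Person and KNOWS and the property names are hypothetical, chosen only for this example, and the graph graph_name is assumed to exist.

```sql
-- Two vertices labeled Person, connected by an edge labeled KNOWS.
SELECT *
FROM cypher('graph_name', $$
    CREATE (a:Person {name: 'Alice'})-[e:KNOWS {since: 2020}]->(b:Person {name: 'Bob'})
    RETURN a, e, b
$$) AS (a agtype, e agtype, b agtype);
```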

Properties

Both vertices and edges may have properties. Properties are attribute values, and each attribute name must be a string.

Vertex

  • A vertex is the basic entity of the graph, with the unique attribute of being able to exist in and of itself.

  • A vertex may be assigned a label.

  • A vertex may have zero or more outgoing edges.

  • A vertex may have zero or more incoming edges.

Data Format

Attribute Name    Description
id                graphid for this vertex
label             Name of the label this vertex has
properties        Properties associated with this vertex

Output:

{id: 1, label: 'label_name', properties: {prop1: value1, prop2: value2}}::vertex

Type Casting a Map to a Vertex

Query

SELECT *
FROM cypher('graph_name', $$
	WITH {id: 0, label: "label_name", properties: {i: 0}}::vertex as v
	RETURN v
$$) AS (v agtype);

Result:

v
{"id": 0, "label": "label_name", "properties": {"i": 0}}::vertex
(1 row)
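The individual attributes of a vertex can be read back with scalar functions. The sketch below reuses the cast from the query above and assumes the id() and properties() functions accept a vertex built this way; the column aliases are illustrative.

```sql
-- Extract the graphid and the property map from a vertex.
SELECT *
FROM cypher('graph_name', $$
    WITH {id: 0, label: "label_name", properties: {i: 0}}::vertex as v
    RETURN id(v), properties(v)
$$) AS (vertex_id agtype, vertex_properties agtype);
```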

Edge

An edge is an entity that encodes a directed connection between exactly two nodes, the source node and the target node. An outgoing edge is a directed relationship from the point of view of its source node. An incoming edge is a directed relationship from the point of view of its target node. An edge is assigned exactly one edge type.

Data Format

Attribute Name    Description
id                graphid for this edge
start_id          graphid for the source node
end_id            graphid for the target node
label             Name of the label this edge has
properties        Properties associated with this edge

Output:

{id: 3, start_id: 1, end_id: 2, label: 'edge_label', properties: {prop1: value1, prop2: value2}}::edge

Type Casting a Map to an Edge

Query

SELECT *
FROM cypher('graph_name', $$
	WITH {id: 2, start_id: 0, end_id: 1, label: "label_name", properties: {i: 0}}::edge as e
	RETURN e
$$) AS (e agtype);

Result:

e
{"id": 2, "label": "label_name", "end_id": 1, "start_id": 0, "properties": {"i": 0}}::edge
(1 row)
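The attributes of an edge can be read back in the same way. This sketch reuses the cast from the query above and assumes the id(), start_id(), end_id(), and type() functions accept an edge built this way; the column aliases are illustrative.

```sql
-- Extract the graphid, both endpoint graphids, and the label from an edge.
SELECT *
FROM cypher('graph_name', $$
    WITH {id: 2, start_id: 0, end_id: 1, label: "label_name", properties: {i: 0}}::edge as e
    RETURN id(e), start_id(e), end_id(e), type(e)
$$) AS (edge_id agtype, source_id agtype, target_id agtype, edge_label agtype);
```

With the values used in the cast, the expected results are 2, 0, 1, and "label_name".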

Composite Entities

Path

A path is a series of alternating vertices and edges. A path must start with a vertex, and have at least one edge.

Type Casting a List to a Path

Query

SELECT *
FROM cypher('graph_name', $$
    WITH [{id: 0, label: "label_name_1", properties: {i: 0}}::vertex,
          {id: 2, start_id: 0, end_id: 1, label: "edge_label", properties: {i: 0}}::edge,
          {id: 1, label: "label_name_2", properties: {}}::vertex
         ]::path as p
    RETURN p
$$) AS (p agtype);

The result is formatted to improve readability.

Result:

p
[{"id": 0, "label": "label_name_1", "properties": {"i": 0}}::vertex, {"id": 2, "label": "edge_label", "end_id": 1, "start_id": 0, "properties": {"i": 0}}::edge,
{"id": 1, "label": "label_name_2", "properties": {}}::vertex]::path
(1 row)
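A path can also be taken apart again. The sketch below builds the same path as the query above and assumes the length(), nodes(), and relationships() functions accept a path created by a cast; the column aliases are illustrative.

```sql
-- Count the edges in a path and list its vertices and edges separately.
SELECT *
FROM cypher('graph_name', $$
    WITH [{id: 0, label: "label_name_1", properties: {i: 0}}::vertex,
          {id: 2, start_id: 0, end_id: 1, label: "edge_label", properties: {i: 0}}::edge,
          {id: 1, label: "label_name_2", properties: {}}::vertex
         ]::path as p
    RETURN length(p), nodes(p), relationships(p)
$$) AS (edge_count agtype, vertices agtype, edges agtype);
```

Since the path alternates two vertices around a single edge, the edge count is expected to be 1.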