Standing Queries

API v2 Coming Soon

API v2 introduces improvements to standing query operations. The current release uses API v1. See Migrating from API v1 for a preview of what's changing.

A standing query is a feature unique to Quine that incrementally matches a subgraph as new data enters the graph. When a full subgraph matches, the result is processed further to modify the graph, produce output as a data source, or take additional action.

Standing Query Structure

Standing queries have two parts: a pattern query and output destinations. The pattern query defines the structure of what we're looking for, and the output destinations specify actions for each result produced.

Think of the pattern query as a coarse filter that is specific enough to MATCH a model event but not so specific that it would match only one unique event. A query in an output destination can then process the result further with a more expressive Cypher query to update the graph, act as an event source, generate metrics, and much more.

Cypher within a pattern query must only contain a MATCH and a RETURN, with an optional WHERE, while Cypher within output destinations is unconstrained, allowing for more expressive queries.

Standing Query Model

As described in the Create Standing Query: POST /api/v2/standing-queries API documentation, a standing query is created via POST to the /api/v2/standing-queries endpoint.

POST /api/v2/standing-queries
{
  "name": "STANDING-1",
  "pattern": {
    "type": "Cypher",
    "query": "MATCH (n) RETURN DISTINCT id(n) AS id"
  },
  "outputs": [
    {
      "name": "enrich-and-print",
      "resultEnrichment": {
        "query": "MATCH (n) WHERE id(n) = $that.data.id RETURN n.line",
        "parallelism": 16
      },
      "destinations": [
        { "type": "StandardOut" }
      ]
    }
  ]
}

Or, in YAML, if you are writing a recipe:

standingQueries:
  - name: STANDING-1
    pattern:
      type: Cypher
      query: MATCH (n) RETURN DISTINCT id(n) AS id
    outputs:
      - name: enrich-and-print
        resultEnrichment:
          query: >-
            MATCH (n)
            WHERE id(n) = $that.data.id
            RETURN n.line
          parallelism: 16
        destinations:
          - type: StandardOut

This structure ensures that the set of positive matches, minus the set of negative matches (also referred to as cancellations), produced by a standing query corresponds to the results that the same Cypher query would produce if it were issued in a batch fashion after all data had been written into the graph.
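The registration request shown earlier in this section could be assembled programmatically. The following is a minimal sketch, not an official client: the base URL `http://localhost:8080` and the helper name `build_standing_query_request` are assumptions for illustration, and the request is only built here, not sent.

```python
import json
import urllib.request

# Assumption: Quine is reachable at this base URL; adjust for your deployment.
BASE_URL = "http://localhost:8080"


def build_standing_query_request(name, pattern_query, enrichment_query):
    """Build (but do not send) the POST request that registers a standing query."""
    payload = {
        "name": name,
        "pattern": {"type": "Cypher", "query": pattern_query},
        "outputs": [
            {
                "name": "enrich-and-print",
                "resultEnrichment": {"query": enrichment_query, "parallelism": 16},
                "destinations": [{"type": "StandardOut"}],
            }
        ],
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/api/v2/standing-queries",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_standing_query_request(
    "STANDING-1",
    "MATCH (n) RETURN DISTINCT id(n) AS id",
    "MATCH (n) WHERE id(n) = $that.data.id RETURN n.line",
)
# To actually register the query, send it with urllib.request.urlopen(request).
print(request.get_method(), request.full_url)
```

Keeping request construction separate from sending makes the payload easy to inspect or log before it reaches a running Quine instance.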

Pattern Match Query

The pattern query in a standing query is a declarative graph pattern expressed using a subset of the Cypher query language containing a MATCH and a RETURN, with an optional WHERE.

Distinct ID Pattern Queries

Quine has two modes available for writing pattern queries: DistinctId and MultipleValues. The mode is DistinctId by default unless you explicitly set it to MultipleValues within your pattern query. Quine can process standing queries of either mode within the same runtime or recipe.

The following constraints apply to Cypher contained in the pattern query string when mode is set to the default DistinctId:

  • Each node identified by the MATCH shall have the following:

    • Node variable name
    • Label (optional but not more than one)
    • Optional map of literal property values to match
  • Nodes in the MATCH must form a connected graph.
  • Nodes in the MATCH must not contain any cycles. In other words, the pattern must be either linear or tree-shaped.
  • Only node variables can be bound in the query MATCH. Edges cannot be aliased to a variable, and path expressions cannot be used (so -[:HAS_FATHER]-> is fine, but -[e:HAS_FATHER]-> is not).
  • Edges in the MATCH must be directed, have exactly one edge label, and cannot be variable-length.
  • Constraints inside the WHERE clause must be AND-ed together and of one of the following forms:

    • nodeName.property = 1 - the property has the literal value on the right
    • nodeName.property <> 1 - the property must exist but be different than the literal value on the right
    • nodeName.property IS NOT NULL - the property must exist
    • nodeName.property IS NULL - the property must not exist
    • nodeName.property =~ "regex" - the property must be a string matching the regex
    • id(nodeName) = 12 - the ID of the node must be exactly the literal value on the right
    • id(nodeName) = idFrom('values', 'to', 'hash') - the ID of the node must match exactly the idFrom() computed from the literal values on the right
  • Exactly one value must be returned, and it must be either the DISTINCT id or strId of a node bound in the MATCH.

For example, RETURN DISTINCT strId(n) or RETURN DISTINCT id(n) as nId are OK, but not RETURN n.name or RETURN id(n) AS nId. The node whose id is returned is the root node - the location in the graph from which the pattern starts being incrementally matched.
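The RETURN rule above can be approximated with a quick pre-flight check before a query is submitted. This is a rough sketch based on a regular expression, not a Cypher parser and not how Quine validates queries; the function name `distinct_id_root` is hypothetical.

```python
import re

# Accepts: RETURN DISTINCT id(<var>) or strId(<var>), with an optional AS alias.
_DISTINCT_ID_RETURN = re.compile(
    r"\bRETURN\s+DISTINCT\s+(?:id|strId)\(\s*(\w+)\s*\)(?:\s+AS\s+\w+)?\s*$",
    re.IGNORECASE,
)


def distinct_id_root(query):
    """Return the root node variable if the RETURN clause fits DistinctId mode, else None."""
    match = _DISTINCT_ID_RETURN.search(query.strip())
    return match.group(1) if match else None


print(distinct_id_root("MATCH (n) RETURN DISTINCT strId(n)"))  # n
print(distinct_id_root("MATCH (n) RETURN n.name"))             # None
```

A check like this only catches the most common mistake (a missing DISTINCT or a non-ID return value); the server remains the authority on whether a pattern is accepted.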

MultipleValue Pattern Queries (Beta)

Beta Feature

This feature is in the beta phase of development.

Pattern queries with MultipleValues mode could:

  • Require syntax changes when the feature releases as GA
  • Use more RAM
  • Consume more disk space
  • Differ in performance from DistinctId pattern queries

MultipleValues mode pattern queries relax some of the constraints imposed by DistinctId. In particular, the WHERE and RETURN portions of the query allow Cypher expressions to be much more expressive.

The syntax and structure of this mode is designed to supersede the DistinctId mode. Thus, any DistinctId standing query pattern is a valid MultipleValues standing query, though not the other way around.

The MultipleValues mode retains the MATCH - WHERE - RETURN shape from DistinctId mode with the addition of the constraints below.

  • Any number of results (not just one) can be returned in the RETURN, including results that aren't node IDs
  • Constraints in the WHERE clause are reduced
  • DISTINCT is required for DistinctId standing queries, but the MultipleValues mode does not support DISTINCT return values.
  • Every variable must represent a node
  • A variable can be used to dereference node properties. For example, RETURN n.name is OK, but RETURN n is not.
  • WHERE and RETURN support id(n) and strId(n) but other functions are not supported.

The MATCH portion of standing queries using the MultipleValues mode removes the syntactic requirements for running in DistinctId mode with two exceptions:

  • Multiple IDs and property values from matched nodes can be returned by RETURN. For example, RETURN n.age + * strId(n) + " " + m.name is fine, but RETURN properties(n) is not.

  • Constraints in the WHERE clause must be defined in terms of the IDs and properties of matched nodes and cannot include sub-queries or procedures.

  • Cannot MATCH variable-length patterns
  • MATCH does not support pattern expressions

Since there isn't exactly one ID being returned, the root of the standing query pattern (the place in the pattern from which incremental matching starts) is instead set to be the first node in the MATCH pattern. This makes it possible to make any node in the pattern the "root".

Pattern Match Results

Both modes for the pattern query return a StandingQueryResult JSON object with meta and data sub-objects.

The meta JSON sub-object consists of the following:

  • isPositiveMatch: whether the result is a new match. When this value is false, it signifies that a previously matched result no longer matches

  • resultId: a UUID generated for each result. This is useful if you wish to track a result in some external system since the resultId of the result with isPositiveMatch = false will match the resultId of the original result (when isPositiveMatch = true).

The data JSON sub-object consists of the following:

  • On a positive match, the data JSON object contains results returned by the pattern query.
  • This object's keys are the names of the values returned (e.g., RETURN DISTINCT strId(n) would have key "strId(n)" and RETURN DISTINCT id(n) AS theId would have key "theId").
  • Each data object returned is analogous to a row returned from a regular Cypher query - the key names match what would normally be Cypher column names.

When DistinctId mode is set, a result is emitted when a complete pattern matches or stops matching, but additional results won't be emitted if there are interim new complete pattern matches.

Example of single-result per root semantics

Consider the following query for watching friends.

// Find people with friends
MATCH (n:Person)-[:friend]->(m:Person)
RETURN DISTINCT strId(n)

If we start by creating disconnected "Peter", "John", and "James" nodes, there will be no matches.

CREATE (:Person { name: "Peter" }),
      (:Person { name: "John" }),
      (:Person { name: "James" })

Then, if we add a "friend" edge from "Peter" to "John", "Peter" will trigger a new standing query match.

MATCH (peter:Person { name: "Peter" }), (john:Person { name: "John" })
CREATE (peter)-[:friend]->(john)

However, adding a second "friend" edge from "Peter" to "James" will not trigger a new match, since "Peter" is already matching (that is, the "Peter" node is not distinct).

MATCH (peter:Person { name: "Peter" }), (james:Person { name: "James" })
CREATE (peter)-[:friend]->(james)

Note that, unlike DistinctId mode queries, MultipleValues mode pattern queries can emit more than one result from each root node. This means that the "Find people with friends" example, if run in the MultipleValues mode, would produce two results (one for each friend), unlike the single result produced in the DistinctId mode.
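The difference in result counts for the friends example can be illustrated with a toy model. The sketch below only imitates the emission rules described in this section over an in-memory edge list; it is not how Quine evaluates standing queries, and both function names are hypothetical.

```python
def distinct_id_results(edges):
    """One result per root: a root is reported the first time it gains a friend."""
    seen, results = set(), []
    for source, _target in edges:
        if source not in seen:
            seen.add(source)
            results.append(source)
    return results


def multiple_values_results(edges):
    """One result per matching (root, friend) pair."""
    return [(source, target) for source, target in edges]


# The two "friend" edges created in the example, in insertion order.
friend_edges = [("Peter", "John"), ("Peter", "James")]
print(distinct_id_results(friend_edges))      # ['Peter']
print(multiple_values_results(friend_edges))  # [('Peter', 'John'), ('Peter', 'James')]
```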

Sample StandingQueryResult:

{
    "meta": {
        "resultId": "b3c35fa4-2515-442c-8a6a-35a3cb0caf6b",
        "isPositiveMatch": true
    },
    "data": {
        "strId(n)": "a0f93a88-ecc8-4bd5-b9ba-faa6e9c5f95d"
    }
}
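A downstream consumer can use resultId and isPositiveMatch to keep a running view of what currently matches. The sketch below assumes results arrive as JSON objects shaped like the sample; the cancellation shown is constructed by hand for illustration, and only its meta fields are read.

```python
import json


def apply_result(active, result):
    """Update the dict of currently matching results, keyed by resultId."""
    result_id = result["meta"]["resultId"]
    if result["meta"]["isPositiveMatch"]:
        active[result_id] = result["data"]
    else:
        # A cancellation carries the resultId of the original positive match.
        active.pop(result_id, None)
    return active


positive = json.loads("""
{
    "meta": {"resultId": "b3c35fa4-2515-442c-8a6a-35a3cb0caf6b", "isPositiveMatch": true},
    "data": {"strId(n)": "a0f93a88-ecc8-4bd5-b9ba-faa6e9c5f95d"}
}
""")
# Hypothetical cancellation for the same result.
cancellation = {"meta": {"resultId": positive["meta"]["resultId"], "isPositiveMatch": False}}

active = {}
apply_result(active, positive)
print(len(active))  # 1
apply_result(active, cancellation)
print(len(active))  # 0
```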

Result Outputs

Once a full pattern match occurs, a StandingQueryResult is produced. A standing query can have any number of output destinations to route StandingQueryResults. The output destinations are processed in parallel.

Output Workflows

Quine provides workflow-based output processing with composable multi-stage pipelines. Results flow through these optional stages in order:

  1. Filtering — Route only positive matches, filtering out cancellations
  2. Transformation — Reshape results with InlineData before output
  3. Enrichment — Augment results with additional data via Cypher queries
  4. Destinations — Route to one or more output destinations (Kafka, webhooks, etc.)

Each stage is optional except destinations—you must configure at least one output destination.

Filtering

Filter results before processing with predicates. Use OnlyPositiveMatch to route only positive matches (when the pattern first matches), filtering out cancellations:

{
  "filter": {
    "type": "OnlyPositiveMatch"
  }
}

Standing query results include an isPositiveMatch metadata flag:

  • true - Pattern newly matched (positive match)
  • false - Pattern no longer matches (cancellation)
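For consumers that receive unfiltered results, the same predicate is easy to apply client-side. This is a minimal sketch of the idea, assuming results are dictionaries with the meta and data sub-objects described earlier; the function name is hypothetical.

```python
def only_positive_matches(results):
    """Yield results whose meta.isPositiveMatch flag is true, dropping cancellations."""
    for result in results:
        if result["meta"]["isPositiveMatch"]:
            yield result


stream = [
    {"meta": {"isPositiveMatch": True}, "data": {"id": "abc123"}},
    {"meta": {"isPositiveMatch": False}, "data": {"id": "abc123"}},
]
print([r["data"]["id"] for r in only_positive_matches(stream)])  # ['abc123']
```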

Transformation

Transform result structure before enrichment. Use InlineData to unwrap the result data from its metadata envelope:

{
  "preEnrichmentTransformation": {
    "type": "InlineData"
  }
}

Before transformation:

{
  "meta": { "isPositiveMatch": true },
  "data": { "id": "abc123", "severity": "high" }
}

After transformation:

{ "id": "abc123", "severity": "high" }
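The effect of InlineData on a single result can be expressed in one line. The sketch below reproduces the before and after shapes shown above; `inline_data` is a hypothetical helper name, not part of Quine.

```python
def inline_data(result):
    """Return only the data payload, discarding the metadata envelope."""
    return dict(result["data"])


before = {
    "meta": {"isPositiveMatch": True},
    "data": {"id": "abc123", "severity": "high"},
}
print(inline_data(before))  # {'id': 'abc123', 'severity': 'high'}
```

Because the envelope is gone after this step, an enrichment query that follows it would refer to fields directly (for example `$that.id`) rather than through `$that.data`.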

Enrichment

Execute Cypher queries to add context to results before routing:

{
  "resultEnrichment": {
    "query": "MATCH (n)-[:RELATED_TO]->(m) WHERE id(n) = $that.id RETURN m.name AS related",
    "parallelism": 16,
    "allowAllNodeScan": false,
    "shouldRetry": true
  }
}

Setting           Description                                         Default
query             Cypher query with $that parameter for result data
parallelism       Concurrent query executions                         32
allowAllNodeScan  Permit queries that scan all nodes                  false
shouldRetry       Retry failed queries on recoverable errors          true

When referencing node IDs from standing query results in enrichment queries, use the quineId() function to convert the ID value:

MATCH (n) WHERE id(n) = quineId($that.data.id) RETURN n.details

Non-Deterministic Ordering

When parallelism is greater than 1, enrichment queries execute concurrently. This means results may arrive at destinations in a different order than they were produced by the standing query. If ordering matters for your use case, set parallelism: 1.
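The ordering effect can be reproduced outside Quine with a small thread-pool simulation. In this sketch each simulated enrichment sleeps for a different duration (standing in for Cypher queries of varying cost) and records when it finishes; it illustrates the general behavior of concurrent execution and makes no claim about Quine's internals.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_enrichment(delays, parallelism):
    """Run one simulated enrichment per result; return indices in completion order."""
    completed = []
    lock = threading.Lock()

    def enrich(index, delay):
        time.sleep(delay)  # stand-in for a Cypher query of varying duration
        with lock:
            completed.append(index)

    # Leaving the with-block waits for all submitted tasks to finish.
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        for index, delay in enumerate(delays):
            pool.submit(enrich, index, delay)
    return completed


delays = [0.05, 0.01, 0.03, 0.0]
print(run_enrichment(delays, parallelism=1))  # [0, 1, 2, 3]
print(run_enrichment(delays, parallelism=4))  # order depends on timing
```

With a single worker, tasks run one after another in submission order, which is why setting parallelism to 1 preserves ordering at the cost of throughput.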

Idempotency

If your enrichment query is not idempotent and shouldRetry is true, effects may occur multiple times on transient failures.

Output Structure

Control how result data is wrapped for destinations.

Bare - Send raw result data without metadata:

{
  "structure": { "type": "Bare" }
}

WithMetadata - Wrap data with metadata including match status:

{
  "structure": { "type": "WithMetadata" }
}

Output Formats

Destinations that support formats can serialize as JSON or Protobuf.

JSON (Default):

{
  "format": { "type": "JSON" }
}

Protobuf:

{
  "format": {
    "type": "Protobuf",
    "schemaUrl": "http://schema-registry:8081/schemas/ids/1",
    "typeName": "com.example.ResultMessage"
  }
}

Destination     JSON  Protobuf
Kafka           Yes   Yes
Kinesis         Yes   Yes
SNS             Yes   Yes
ReactiveStream  Yes   Yes
File            Yes   No
HTTP Webhook    Yes   No
Standard Out    Yes   No
Slack           Yes   No
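The support matrix above can be encoded as a lookup to catch an unsupported combination before a configuration is submitted. This is a sketch only: the keys are the labels from the table, which are not necessarily the type names the API expects, and `supports_format` is a hypothetical helper.

```python
# Format support per destination, transcribed from the table above.
SUPPORTED_FORMATS = {
    "Kafka": {"JSON", "Protobuf"},
    "Kinesis": {"JSON", "Protobuf"},
    "SNS": {"JSON", "Protobuf"},
    "ReactiveStream": {"JSON", "Protobuf"},
    "File": {"JSON"},
    "HTTP Webhook": {"JSON"},
    "Standard Out": {"JSON"},
    "Slack": {"JSON"},
}


def supports_format(destination, fmt):
    """Return True if the destination is listed as supporting the given format."""
    return fmt in SUPPORTED_FORMATS.get(destination, set())


print(supports_format("Kafka", "Protobuf"))  # True
print(supports_format("File", "Protobuf"))   # False
```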

Output Destinations

Cypher Query

The Cypher query destination is particularly powerful, making it possible to post-process pattern query results to collect more information from the graph or to filter out matches that don't meet some requirements.

The result object is passed to the Cypher query as the parameter $that.

Be aware that non-trivial or long-running operations with results will consume system resources and cause the system to backpressure and slow down other processing (like data ingest).

Drop

Drops the current result and ends processing for the destination.

POST to Webhook

Makes an HTTP[S] POST for each result. The data in the request payload can be customized in a Cypher query preceding this step.

Publish to Slack

Sends a message to Slack via a configured Slack App webhook URL. See https://api.slack.com/messaging/webhooks.

Setting                Description                                       Default
hookUrl                Slack webhook URL
onlyPositiveMatchData  Only send positive matches (skip cancellations)   false
intervalSeconds        Minimum seconds between messages                  20

Slack limits the rate of messages which can be posted (1 message per second). Quine batches results that arrive faster than the configured intervalSeconds and publishes them as a single aggregated message when the interval allows.
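The batching behavior can be pictured as a buffer that is released at most once per interval. The class below is a simplified sketch of that idea, not Quine's implementation: `IntervalBatcher` is a hypothetical name, time is passed in explicitly to keep the example deterministic, and a real implementation would also flush on a timer rather than only when a new result arrives.

```python
class IntervalBatcher:
    """Collect results and release them at most once per interval."""

    def __init__(self, interval_seconds=20):
        self.interval_seconds = interval_seconds
        self.pending = []
        self.last_sent = None  # time of the last published message

    def offer(self, result, now):
        """Add a result at time `now`; return a batch to publish, or None to keep waiting."""
        self.pending.append(result)
        if self.last_sent is None or now - self.last_sent >= self.interval_seconds:
            batch, self.pending = self.pending, []
            self.last_sent = now
            return batch
        return None


batcher = IntervalBatcher(interval_seconds=20)
print(batcher.offer("a", now=0))   # ['a']
print(batcher.offer("b", now=5))   # None
print(batcher.offer("c", now=10))  # None
print(batcher.offer("d", now=20))  # ['b', 'c', 'd']
```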

Log JSON to Standard Out

Prints each result as a single-line JSON object to standard output on the Quine server.

This output type can be configured with Complete to print a line for every result, backpressuring and slowing down the stream as needed to print every result.

Alternatively, it can be configured with FastSampling to log results on a best-effort basis, dropping some results to avoid slowing down the stream.

Note that neither option changes the behavior of other outputs registered on the same standing query.
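The trade-off between the two settings resembles the difference between a blocking and a non-blocking write to a bounded buffer. The sketch below uses Python's standard queue module to show that contrast; the function names are hypothetical and the code is an analogy, not a description of how Quine implements either option.

```python
import queue


def offer_complete(buffer, result):
    """Block until there is room, so no result is lost (backpressure)."""
    buffer.put(result)
    return True


def offer_fast_sampling(buffer, result):
    """Never block: drop the result when the buffer is full."""
    try:
        buffer.put_nowait(result)
        return True
    except queue.Full:
        return False


buffer = queue.Queue(maxsize=2)
accepted = [offer_fast_sampling(buffer, i) for i in range(3)]
print(accepted)  # [True, True, False]
```

In the blocking variant the producer waits whenever the buffer is full, which is what slows the upstream when every result must be printed.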

Log JSON to a File

Writes each result as a single-line JSON object to a file on the local filesystem.

Publish to Kafka Topic

Publishes a record for each result to the provided Apache Kafka topic. Records can be serialized as JSON or Protocol Buffers before being published to Kafka.

Publish to Kinesis Stream

Publishes a record for each result to the provided Kinesis stream. Records can be serialized as JSON or Protocol Buffers before being published to Kinesis.

Setting                     Description                        Default
streamName                  Kinesis stream name
credentials                 AWS credentials (optional)         None
region                      AWS region (optional)              None
format                      Output format (JSON or Protobuf)   JSON
kinesisParallelism          Concurrent publish operations      None
kinesisMaxBatchSize         Maximum records per batch          None
kinesisMaxRecordsPerSecond  Rate limit (records/second)        None
kinesisMaxBytesPerSecond    Rate limit (bytes/second)          None

Publish to SNS Topic

Publishes an AWS SNS record to the provided topic containing JSON for each result.

Setting      Description                        Default
topic        SNS topic ARN
credentials  AWS credentials (optional)         None
region       AWS region (optional)              None
format       Output format (JSON or Protobuf)   JSON

Credential Validation

Ensure your credentials and topic ARN are correct. If writing to SNS fails, the write will be retried indefinitely. If the error is not fixable (e.g., the topic or credentials cannot be found), the outputs will never be emitted and the output could stop running.

Publish to Reactive Stream

Broadcasts results to a TCP-based reactive stream endpoint. Clients can connect to receive a continuous stream of results.

Setting  Description                         Default
address  Address to bind the stream server   localhost
port     Port to bind the stream server
format   Output format (JSON or Protobuf)    JSON

Cluster Limitation

Reactive Stream outputs do not function correctly when running in a cluster. Use Kafka or Kinesis for clustered deployments.

Inspecting and Debugging Standing Queries

For detailed guidance on debugging standing queries, see the Troubleshooting Queries guide.

Since standing queries use a subset of Cypher syntax, you can run the match pattern as a regular query to understand what data would match. When doing so, constrain the starting points if there is already a large amount of data in the system (see Using IDs in a Query).