Quine Recipes¶

In this article we will cover Quine recipes -- what are they for, their components, and how to use them to quickly iterate through the design-code-test portion of the development lifecycle.

What is a Quine Recipe¶

A recipe is a collection of configurations that sets a Quine instance up for a specific purpose. A recipe is defined in a 'yaml' file and contains configuration for: ingest streams, standing queries, UI configuration, and some metadata about the recipe. Recipes are a great format for sharing what you've created, and for quickly iterating on your thoughts. Quine recipe components correspond directly to Quine's REST API elements. This means patterns and behaviors developed in a 'dev' environment can be applied to the production environment with only slight modifications.

Some differences between using recipes vs using API calls:¶

API calls allow for naming of ingest streams and standing queries. In recipes, they are named for you in the format of INGEST-# or STANDING-# By default, Quine launched with recipes creates a temporary persistent data store (in your tmp dir). Each subsequent launch of Quine w/ a recipe replaces this temporary data store

These differences between API calls and Recipes make them well suited for different things:

Recipes are fantastic at quick iteration. No need to name your ingests and standing queries, and your data store is automatically cleaned up when Quine is restarted
API calls are the way to setup Quine in production. You want to persist your data, even between Quine restarts, and you will want to name your ingest streams and standing queries

A recipe allows you to:

Configure ingest streams.
Configure standing queries.
Configure the Quine Exploration UI for graph analysis.
Iterate rapidly during the development phase.

Recipes are covered in detail inside the recipe reference

Differences Between Recipes and the REST API¶

There are a couple of operational differences that you need to keep in mind when launching Quine along with a recipe.

API calls allow naming - When you create an ingest stream or a standing query using the API, you choose the name for the object in the URL. The corresponding recipe object uses standardized names in the form of INGEST-# and STANDING-#.
Persistent storage - Starting the Quine application jar without providing a configuration file creates a persistent data store in the local directory. That persistor retains the previous graph state and is appended to each time Quine starts. Alternatively, when Quine launches a recipe, a temporary persistent data store is created in the system tmp directory. Each subsequent launch of the recipe will replace the data store, and discard the graph from the previous run. You can override the default temporary storage behavior with your configuration settings by using the --force-config command line flag.

Recipe Structure¶

A recipe is stored in a single YAML text file. The file must contain a single object with the following attributes:

Attribute	Type	Description
`version`	Integer	Schema versioning; only supported value is 1
`title`	String	Identifies the Recipe
`contributor`	String	URL to social profile of the person or organization responsible for this Recipe
`summary`	String	Brief information about this Recipe
`description`	Text Block	Long form description about this Recipe
`ingestStreams`	Array of IngestStream objects	Define how data is read from data sources, transformed, and loaded into the graph
`standingQueries`	Array of StandingQuery objects	Define both sub-graph patterns for Quine to match and subsequent output actions
`nodeAppearances`	Array of NodeAppearance objects	Customize node appearance in the exploration UI
`quickQueries`	Array of QuickQuery objects	Add queries to node context menus in the exploration UI
`sampleQueries`	Array of SampleQuery objects	Customize sample queries listed in the exploration UI
`statusQuery`	A CypherQuery object	OPTIONAL Cypher query that is executed and reported to the terminal window during execution

Use the yaml structure below as a starting point when developing your recipes.

version: 1
title: The title of the recipe goes here
contributor: Your GitHub profile link here
summary: Single line summary of your recipe
description: |-
  Long format description of your recipe
ingestStreams: [ ]
standingQueries: [ ]
nodeAppearances: [ ]
quickQueries: [ ]
sampleQueries: [ ]
statusQuery: null

Now let's build a recipe to reproduce the use case from the Getting Started scenario. Remember our goal:

For the sake of this tutorial, assume that you need to separate human-generated events from bot-generated events in the English Wikipedia database and send them to a destination in your data pipeline for additional processing.

Recipe Metadata¶

The first few lines of a Quine recipe contain information about who wrote the recipe and what the recipe is intended to do.

Starting out with the recipe template from above, we can fill in the version, title, contributor, summary, and description for our recipe.

version: 1
title: Wikipedia non-bot page update event stream
contributor: https://github.com/maglietti
summary: Stream page-update events that were not created by bots
description: |-
  This recipe will separate human generated events from bot generated
  events in the english wikipedia database page-update event stream
  and store them for additional processing.
  API Reference: https://stream.wikimedia.org/?doc#/streams/get_v2_stream_mediawiki_revision_create

Ingest Stream¶

Ok, now we need to transform the ingest stream object that we POSTed to /api/v1/ingest/{name} from JSON to YAML. The simplest way to convert the JSON API body to YAML is to use a tool like YAML ❤ JSON in VSCode.

Here's the JSON version of the ingest stream that we developed earlier in the ingest streams getting started tutorial.

{
  "format": {
    "query": "MATCH (revNode),(pageNode),(dbNode),(userNode),(parentNode) WHERE id(revNode) = idFrom('revision', $that.rev_id) AND id(pageNode) = idFrom('page', $that.page_id) AND id(dbNode) = idFrom('db', $that.database) AND id(userNode) = idFrom('id', $that.performer.user_id) AND id(parentNode) = idFrom('revision', $that.rev_parent_id) SET revNode = $that, revNode.bot = $that.performer.user_is_bot, revNode:revision SET parentNode.rev_id = $that.rev_parent_id SET pageNode.id = $that.page_id, pageNode.namespace = $that.page_namespace, pageNode.title = $that.page_title, pageNode.comment = $that.comment, pageNode.is_redirect = $that.page_is_redirect, pageNode:page SET dbNode.database = $that.database, dbNode:db SET userNode = $that.performer, userNode.name = $that.performer.user_text, userNode:user CREATE (revNode)-[:TO]->(pageNode), (pageNode)-[:IN]->(dbNode), (userNode)-[:RESPONSIBLE_FOR]->(revNode), (parentNode)-[:NEXT]->(revNode)",
    "parameter": "that",
    "type": "CypherJson"
  },
  "type": "ServerSentEventsIngest",
  "url": "https://stream.wikimedia.org/v2/stream/mediawiki.revision-create"
}

And the converted YAML version, edited for readability and style

ingestStreams:
  - type: ServerSentEventsIngest
    url: https://stream.wikimedia.org/v2/stream/mediawiki.revision-create
    format:
      type: CypherJson
      parameter: that
      query: |-
        MATCH (revNode),(pageNode),(dbNode),(userNode),(parentNode)
        WHERE id(revNode) = idFrom('revision', $that.rev_id) 
          AND id(pageNode) = idFrom('page', $that.page_id) 
          AND id(dbNode) = idFrom('db', $that.database)
          AND id(userNode) = idFrom('id', $that.performer.user_id) 
          AND id(parentNode) = idFrom('revision', $that.rev_parent_id)

        SET revNode = $that,
            revNode.bot = $that.performer.user_is_bot,
            revNode:revision

        SET parentNode.rev_id = $that.rev_parent_id

        SET pageNode.id = $that.page_id, 
            pageNode.namespace = $that.page_namespace, 
            pageNode.title = $that.page_title, 
            pageNode.comment = $that.comment, 
            pageNode.is_redirect = $that.page_is_redirect, 
            pageNode:page 

        SET dbNode.database = $that.database, 
            dbNode:db 

        SET userNode = $that.performer, 
            userNode.name = $that.performer.user_text, 
            userNode:user 

        CREATE (revNode)-[:TO]->(pageNode),
               (pageNode)-[:IN]->(dbNode),
               (userNode)-[:RESPONSIBLE_FOR]->(revNode),
               (parentNode)-[:NEXT]->(revNode)

Standing Query¶

We can transform the standing query from the standing queries tutorial the same way that we transformed the ingest stream.

JSON:

{
  "pattern": {
    "query": "MATCH (userNode:user {user_is_bot: false})-[:RESPONSIBLE_FOR]->(revNode:revision {database: 'enwiki'}) RETURN DISTINCT id(revNode) as id",
    "type": "Cypher"
  },
  "outputs": {
    "print-output": {
      "type": "CypherQuery",
      "query": "MATCH (n) WHERE id(n) = $that.data.id RETURN properties(n)",
      "andThen": {
        "type": "PrintToStandardOut"
      }
    }
  }
}

YAML:

standingQueries:
  - pattern:
      query: |-
        MATCH (userNode:user {user_is_bot: false})-[:RESPONSIBLE_FOR]->(revNode:revision {database: 'enwiki'})
        RETURN DISTINCT id(revNode) as id
      type: Cypher
    outputs:
      print-output:
        type: CypherQuery
        query: |-
          MATCH (n)
          WHERE id(n) = $that.data.id
          RETURN properties(n)
        andThen:
          type: PrintToStandardOut

At this point, our recipe will produce exactly the same running configuration that we accomplished with the API calls submitted during the previous sections in this getting started tutorial.

Running a recipe¶

Recipes are launched by passing the YAML file as an argument to the Quine jar file using -r.

I saved the recipe elements that we created above into a file named wikipedia-non-bot-revisions.yaml on my laptop. The recipe file is in the same directory that I have the quine-x.x.x.jar file.

Launching Quine and the recipe.

❯ java -jar quine-1.7.0.jar -r wikipedia-non-bot-revisions.yaml
Graph is ready
Running Recipe Wikipedia non-bot page update event stream
Running Standing Query STANDING-1
Running Ingest Stream INGEST-1
Quine web server available at http://127.0.0.1:8080

Notice that Quine announced the recipe title, provided the names it generated for the standing query and ingest stream, then immediately began outputting revision events to the console.

2022-08-30 09:13:58,636 Standing query `print-output` match: {"meta":{"isPositiveMatch":true,"resultId":"9f4650d3-eb01-4a95-9836-fa17e932d430"},"data":{"properties(n)":{"$schema":"/mediawiki/revision/create/1.1.0","comment":"z","database":"enwiki","meta":{"domain":"en.wikipedia.org","dt":"2022-08-30T14:13:57Z","id":"63a721ff-a3d0-4ab8-a9c2-0890642a2696","offset":2843700515,"partition":0,"request_id":"de759ed9-afb4-4c78-812c-70255d2c6b6b","stream":"mediawiki.revision-create","topic":"eqiad.mediawiki.revision-create","uri":"https://en.wikipedia.org/wiki/User:Peter_I._Vardy/sandbox"},"page_id":8188575,"page_is_redirect":false,"page_namespace":2,"page_title":"User:Peter_I._Vardy/sandbox","parsedcomment":"z","performer":{"user_edit_count":205321,"user_groups":["autoreviewer","extendedconfirmed","reviewer","*","user","autoconfirmed"],"user_id":2675188,"user_is_bot":false,"user_registration_dt":"2006-11-06T14:56:58Z","user_text":"Peter I. Vardy"},"rev_content_changed":true,"rev_content_format":"text/x-wiki","rev_content_model":"wikitext","rev_id":1107535873,"rev_len":6319,"rev_minor_edit":false,"rev_parent_id":1107530092,"rev_sha1":"j67dl0l3cxot84gy2qft4rs2jnsb3cz","rev_slots":{"main":{"rev_slot_content_model":"wikitext","rev_slot_origin_rev_id":1107535873,"rev_slot_sha1":"j67dl0l3cxot84gy2qft4rs2jnsb3cz","rev_slot_size":6319}},"rev_timestamp":"2022-08-30T14:13:57Z"}}}

Let's pause the ingest stream for now to stop the scrolling in our terminal window. Remember that for recipes, the name of the ingest stream is assigned when the recipe is run.

curl -X "PUT" "http://127.0.0.1:8080/api/v1/ingest/INGEST-1/pause"

Also notice that interleaved with the revision events, Quine displays a running total of events processed by the ingest stream and standing query. These are easier to see once the ingest stream is paused.

 | => STANDING-1 count 50
 | => INGEST-1 status is paused and ingested 715

Depending on your application, you may choose to leave the nodeApperances, quickQueries and sampleQueries arrays empty. When configured, they can improve the readability and ease of analysis of the streaming graph within the exploration UI. We cover the Exploration UI in another tutorial.

Status Query¶

Depending on your event stream, including an optional status query in your recipe can provide status updates to the terminal window while a less frequent event is waiting for a standing query match. Consider including a statusQuery in your recipe if you need to track more complex metrics about events processed by the ingest stream. Remember that a recipe will output the count of events processed by a standing query by default.

statusQuery: 
  cypherQuery: MATCH (n) RETURN distinct labels(n), count(*)

Do not include the statusQuery attribute in your recipe file if you do not intend to use it.

Next Steps¶

Great job making it this far. You should now have the fundamental knowledge to add Quine into an event streaming data pipeline.

Have a question, suggestion, or did you get stuck somewhere? We welcome your feedback! Please join the Quine Community and let us know. The team is always happy to discuss Quine and answer your questions.