Quick Start¶
What is Quine¶
Quine combines the real-time event stream processing capabilities of systems like Flink and ksqlDB with a graph-structured data model as found in graph databases like Neo4j and TigerGraph. Quine is a key participant in a streaming event data pipeline that consumes data, builds it into a graph structure, runs computation on that graph to answer questions or compute results, and then stream them out.
All together, Quine can:
- Ingest high-volume streaming event data from stream processing systems like Kafka, Kinesis, APIs or databases/data warehouses
- Convert and merge it into durable, versioned, connected data (a graph)
- Monitor that connected data for complex structures or values
- Trigger arbitrary computation in the event of each match
- Emit high-value events out to stream processing systems like Kafka, Kinesis, APIs or databases/data warehouses
This collection of capabilities represents a complete system for stateful event-driven arbitrary computation in a platform scalable to any size of data or desired throughput.
Before you begin¶
Concepts that you should already be familiar with:
- Event driven architectures
- Graph database concepts
- Cypher graph query language
- Interacting with REST API endpoints
If you are unsure of any of these, please consider familiarizing yourself with them in depth before you continue. Start with Installing Quine for a more thorough introduction to Quine and streaming graph concepts.
Install Quine¶
Start Quine using a distribution or locally compiled source code (described below).
From a Docker container¶
- With Docker installed, run Quine from Docker Hub.
docker run -p 8080:8080 thatdot/quine
From an executable¶
Quine requires a Java 11 or newer JRE.
- Download the executable
jar
file. - From a working directory:
java -jar quine-1.7.3.jar
From source code¶
- Clone the source code from GitHub.
- Ensure that you have a recent Java Development Kit (11 or newer) and
sbt
installed. - From the main directory of the repository on your machine:
sbt quine/run
Connect to Quine¶
There are two main structures that you need to configure in Quine; the ingest stream forms the event stream into the graph, and standing queries which match and take action on nodes in the graph. Follow the links in the tutorial to the API documentation to learn more about the schema for each object.
Connect an Event Stream¶
For example, let's ingest the live stream of new pages created on Wikipedia; mediawiki.page-create .
Create a "server sent events" ingest stream to connect Quine to the page-create
event stream using the ingest API endpoint.
Issue the following curl
command in a terminal running on the machine were you started Quine.
curl -X "POST" "http://127.0.0.1:8080/api/v1/ingest/wikipedia-page-create" \
-H 'Content-Type: application/json' \
-d $'{
"format": {
"query": "CREATE ($that)",
"parameter": "that",
"type": "CypherJson"
},
"type": "ServerSentEventsIngest",
"url": "https://stream.wikimedia.org/v2/stream/page-create"
}'
Congratulations! You are ingesting raw events into Quine and manifesting nodes in the graph.
Let's look at a node to see what it contains by submitting a Cypher request via the query/cypher API endpoint.
curl -X "POST" "http://127.0.0.1:8080/api/v1/query/cypher" \
-H 'Content-Type: text/plain' \
-d "CALL recentNodes(1)"
This query calls the recentNodes
Cypher procedure to retrieve the most recent one (1) node.
{
"columns": [
"node"
],
"results": [
[
{
"id": "7a9a936f-ae1a-49c5-ba99-0ec6401bfd7d",
"labels": [],
"properties": {
"database": "enwikisource",
"rev_slots": {
"main": {
"rev_slot_content_model": "proofread-page",
"rev_slot_origin_rev_id": 12558576,
"rev_slot_sha1": "lqrhvc49cgzegvvfqnzg3c6bpxs55up",
"rev_slot_size": 1699
}
},
"rev_id": 12558576,
"rev_timestamp": "2022-08-23T18:34:25Z",
"rev_len": 1699,
"rev_minor_edit": false,
"parsedcomment": "<span dir=\"auto\"><span class=\"autocomment\">Proofread</span></span>",
"page_title": "Page:The_Works_of_H_G_Wells_Volume_6.pdf/423",
"rev_content_format": "text/x-wiki",
"page_id": 4034745,
"page_is_redirect": false,
"meta": {
"domain": "en.wikisource.org",
"dt": "2022-08-23T18:34:25Z",
"id": "3c8c9150-19aa-4f33-be5d-bf3ef8d8a994",
"offset": 241649600,
"partition": 0,
"request_id": "0ba80d5c-3c97-4971-b0d2-360c5d20e0f6",
"stream": "mediawiki.page-create",
"topic": "eqiad.mediawiki.page-create",
"uri": "https://en.wikisource.org/wiki/Page:The_Works_of_H_G_Wells_Volume_6.pdf/423"
},
"page_namespace": 104,
"rev_sha1": "lqrhvc49cgzegvvfqnzg3c6bpxs55up",
"comment": "/* Proofread */",
"rev_content_model": "proofread-page",
"$schema": "/mediawiki/revision/create/1.1.0",
"performer": {
"user_edit_count": 10176,
"user_groups": [
"autopatrolled",
"*",
"user",
"autoconfirmed"
],
"user_id": 141433,
"user_is_bot": false,
"user_registration_dt": "2009-07-12T12:33:52Z",
"user_text": "MER-C"
}
}
}
]
]
}
The API call is the functional equivalent to issuing the CALL recentNodes(1)
query in the Exploration UI:
Note
Your API response will contain a different set of parameters than above because you are ingesting a stream of live events from Wikipedia.
This ingest stream is performing the most basic of ETL functionality, it manifests a disconnected node directly from each event emitted from the Wikipedia event stream.
Create a Standing Query¶
A Standing Query matches some graph structure incrementally while new event data is ingested. Creating a standing query is done with a single call to the standing/query API endpoint.
Right now, Quine is the only component in our data pipeline. Let's configure a standing query that watches for new nodes to enter the graph and print the node contents to the console.
Note
A standing query can emit the event data, re-form the event into new events, or trigger actions to inform elements downstream in your data pipeline (e.g., a Kafka topic).
curl -X "POST" "http://127.0.0.1:8080/api/v1/query/standing/wikipedia-new-page-node" \
-H 'Content-Type: application/json' \
-d $'{
"pattern": {
"query": "MATCH (n) RETURN DISTINCT id(n)",
"type": "Cypher"
},
"outputs": {
"print-output": {
"type": "PrintToStandardOut"
}
}
}'
You will see new node events similar to the one below appear in the same console window where you launched Quine immediately after running the curl
command. These events contain the id
of each new node created in the graph.
2022-08-23 14:06:55,174 Standing query `print-output` match: {"meta":{"isPositiveMatch":true,"resultId":"dab367a3-b272-7dba-c12e-a65bc9f5e0b8"},"data":{"id(n)":"911d88e0-413a-42bd-a0f8-dd15bbf6aff6"}}
Ending the Stream¶
This quick-start is a foundation that you can build on top of to ingest and interpret your own streams of data. But for now, we can pause the ingest stream and shutdown Quine before moving on.
curl -X "PUT" "http://127.0.0.1:8080/api/v1/ingest/wikipedia-page-create/pause"
curl
will return a confirmation that the ingest stream is paused and metrics about what had been ingested to that point.
{
"name": "wikipedia-page-create",
"status": "Paused",
"settings": {
"format": {
"query": "CREATE ($that)",
"parameter": "that",
"type": "CypherJson"
},
"url": "https://stream.wikimedia.org/v2/stream/page-create",
"parallelism": 16,
"type": "ServerSentEventsIngest"
},
"stats": {
"ingestedCount": 3096,
"rates": {
"count": 3096,
"oneMinute": 1.1004373606561983,
"fiveMinute": 1.1045410320126854,
"fifteenMinute": 1.123947504968256,
"overall": 1.12992666124191
},
"byteRates": {
"count": 4471549,
"oneMinute": 1529.8568026569903,
"fiveMinute": 1553.9585381108302,
"fifteenMinute": 1614.01351325884,
"overall": 1631.9520432126692
},
"startTime": "2022-08-23T18:30:58.571823Z",
"totalRuntime": 2739271
}
}
You can stop Quine by either typing CTRL-c
into the terminal window or perform a graceful shut down by issuing a POST to the admin/shutdown endpoint.
curl -X "POST" "http://127.0.0.1:8080/api/v1/admin/shutdown"
Next Steps¶
Learn how Quine uses recipes to store a graph configuration and UI enhancements in the recipes getting started guide.