Common Pitfalls¶
Quine is a streaming graph — not a database. Many patterns that work in traditional databases or graph databases like Neo4j will silently produce wrong results, lose data, or cause severe performance problems in Quine. Read this page before writing your first ingest query.
For Cypher-specific differences from Neo4j, see Quine Cypher vs. Neo4j Cypher.
Ingest Queries¶
Every query must anchor nodes by ID¶
Any query that finds nodes by property instead of by ID triggers a full scan of every node in the graph. This includes MERGE with property matchers.
// WRONG: scans all nodes
MATCH (user) WHERE user.email = "alice@example.com" ...
// CORRECT: direct lookup
MATCH (user) WHERE id(user) = idFrom("user", "alice@example.com") ...
Details → Troubleshooting Ingest
Mismatched idFrom() arguments silently orphan data¶
If one ingest stream uses idFrom("order", orderId) and another uses idFrom(orderId) without the prefix, they address different nodes. Relationships between them will never form, and there is no error message.
Details → Troubleshooting Ingest
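As a minimal sketch of keeping the arguments aligned, consider two ingest queries that both touch the same order node. The record fields `orderId` and `shipmentId` and the `FULFILLS` edge type are illustrative assumptions, not names from Quine; `$that` refers to the incoming record. The first query belongs to the stream that carries order records:

```cypher
// Stream A: addresses the order node with the "order" prefix.
MATCH (order)
WHERE id(order) = idFrom("order", $that.orderId)
SET order.orderId = $that.orderId
```

The second query belongs to a different stream and has to repeat the same prefix and the same argument order to reach the same node:

```cypher
// Stream B: idFrom("order", ...) matches Stream A exactly.
// Writing idFrom($that.orderId) here would address a different node.
MATCH (order), (shipment)
WHERE id(order) = idFrom("order", $that.orderId)
  AND id(shipment) = idFrom("shipment", $that.shipmentId)
SET shipment.shipmentId = $that.shipmentId
CREATE (shipment)-[:FULFILLS]->(order)
```

One way to reduce the risk is to keep the idFrom() call for each node type in one shared snippet and copy it verbatim into every ingest query that touches that type.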
Ingest queries run in parallel — don't depend on other records¶
Each record is processed independently and concurrently. A query that assumes another record has already been ingested (e.g., looking up a customer node by property to attach an order) will silently skip records when the dependency hasn't been processed yet.
Solution: Make each ingest query self-contained. Every record should create all the nodes and edges it references, using idFrom() to address them. Order of arrival should not matter.
Details → Troubleshooting Ingest
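A self-contained order ingest might look like the following sketch. The field names (`customerId`, `orderId`, `total`) and the `PLACED` edge type are assumptions made for illustration. The query addresses both nodes by ID, so it does not matter whether a customer record has already been ingested:

```cypher
// Fragile alternative (commented out): it finds the customer by property,
// so it depends on the customer record having arrived first.
// MATCH (customer) WHERE customer.customerId = $that.customerId ...

// Self-contained: both nodes are addressed by ID from this record alone.
MATCH (customer), (order)
WHERE id(customer) = idFrom("customer", $that.customerId)
  AND id(order) = idFrom("order", $that.orderId)
SET customer.customerId = $that.customerId,
    order.orderId = $that.orderId,
    order.total = $that.total
CREATE (customer)-[:PLACED]->(order)
```

When the customer record arrives later, its own ingest query can address the same node with idFrom("customer", ...) and set the remaining properties.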
Ingest queries should be idempotent¶
At-least-once delivery means records may be processed more than once after a restart. Design queries so that processing the same record twice produces the same graph state.
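The difference can be sketched with two kinds of property update. The `account` node and the fields `accountId` and `balance` are hypothetical. Assigning a value taken from the record is safe to replay, while incrementing a counter is not:

```cypher
MATCH (account)
WHERE id(account) = idFrom("account", $that.accountId)
// Idempotent: replaying the record assigns the same value again.
SET account.balance = $that.balance
// Not idempotent (commented out): a replayed record would be counted twice.
// SET account.eventCount = coalesce(account.eventCount, 0) + 1
```

If a count is genuinely needed, one option is to derive it from data that is itself idempotent, such as one node per event addressed by the event's own ID.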
Standing Queries¶
DistinctId mode fires once per root node, not per match¶
In the default DistinctId mode, once a pattern matches for a given root node, additional matches from that same root node do not emit new results. If you need a result for every match, use MultipleValues mode.
Details → Standing Query Modes
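To illustrate the difference, here is a sketch of two patterns over a hypothetical order graph. The `PLACED` edge type and the `status` property are assumptions. A pattern written for DistinctId mode returns only the ID of the root node:

```cypher
// DistinctId: a customer with three failed orders produces one result.
MATCH (customer)-[:PLACED]->(order)
WHERE order.status = "FAILED"
RETURN DISTINCT id(customer) AS id
```

A pattern intended for MultipleValues mode can return a row for each match:

```cypher
// MultipleValues: the same customer produces one result per failed order.
MATCH (customer)-[:PLACED]->(order)
WHERE order.status = "FAILED"
RETURN id(customer) AS customerId, id(order) AS orderId
```

The mode itself is chosen when the standing query is registered, not in the Cypher text. Check the Standing Query Modes page for the exact pattern restrictions each mode imposes.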
Results are best-effort and can be dropped¶
Standing query outputs are not durably queued. Results can be lost if the output queue overflows, an output destination fails, or the process restarts. Treat standing query outputs as notifications, not authoritative records.
New standing queries fire on existing data¶
When you register a standing query, it evaluates against all data already in the graph — not just new data arriving after registration. This can produce a burst of results from historical data.
Details → Troubleshooting Queries
Performance¶
Counting nodes is expensive¶
MATCH (n) RETURN count(*) scans the entire graph. Unlike a database, Quine does not maintain a running node count.
Details → Streaming Graph vs. Database
Labels do not improve query performance¶
In Neo4j, MATCH (p:Person) uses a label index. In Quine, labels are stored as properties with no index — filtering by label still requires scanning all nodes.
Supernodes degrade traversal performance¶
Nodes with thousands of edges (supernodes) slow down queries that traverse outward from them. A supernode that is only ever reached as the endpoint of a traversal is fine. Consider using properties instead of edges for high-cardinality relationships, or partitioning supernodes by time period.
Details → Diagnosing Bottlenecks
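Partitioning by time period can be sketched as follows. The example assumes each record already carries a precomputed `hourBucket` field (for example "2024-05-01T13"); that field, the other field names, and the `OCCURRED_ON` edge type are all hypothetical. Each event attaches to a per-host, per-hour node rather than to one node for the host:

```cypher
// One bucket node per (host, hour), so edges are spread across many nodes.
MATCH (event), (bucket)
WHERE id(event) = idFrom("event", $that.eventId)
  AND id(bucket) = idFrom("host-hour", $that.host, $that.hourBucket)
SET event.timestamp = $that.timestamp,
    bucket.host = $that.host,
    bucket.hour = $that.hourBucket
CREATE (event)-[:OCCURRED_ON]->(bucket)
```

The bucket size is a tuning choice: smaller buckets keep each node's edge count low but increase the number of nodes a query must visit to cover a time range.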
Default parallelism may not be optimal¶
Ingest parallelism defaults to 16. The right value depends on your data, queries, and infrastructure. Experimenting with this value can significantly improve throughput. When running multiple ingests on the same host, divide the optimal single-ingest parallelism across them.
Details → Troubleshooting Ingest
Standing query backpressure slows ingest¶
If standing queries can't keep up with ingest, Quine pauses ingest to prevent result loss. Monitor the shared.valve.ingest metric — a non-zero value means ingest is being throttled.
Details → Diagnosing Bottlenecks
Operations¶
Graceful shutdown is required to prevent data loss¶
Use the POST /api/v2/system:shutdown endpoint to shut down cleanly. A hard kill (SIGKILL, container eviction) can lose data that hasn't been persisted yet.
Details → Operational Considerations
JVM heap should not exceed 16GB¶
Garbage collection pauses grow significantly once the heap exceeds 16 GB. 12 GB is a good starting point. Additional physical memory beyond the heap is needed for off-heap overhead; budget 25–33% extra.
Recipes use temporary storage by default¶
When Quine launches a recipe, it creates a temporary data store in the system temp directory. Each subsequent launch replaces the previous data. Use --force-config with a persistent data path to retain data between runs.