OBSL vs Cube¶
A feature comparison between OrionBelt Semantic Layer (OBSL) and Cube (formerly Cube.js — the open-source semantic layer from Cube Dev). Captured 2026-05-23.
TL;DR¶
- Cube wins on: pre-aggregations (its flagship feature — materialized rollups with refresh, partitioning, and lambda strategies), GraphQL alongside REST, first-class multi-tenancy and row-level security via
query_rewrite+ JWT security contexts, Twig templating for dynamic models, and the broadest data-source portfolio of the OSS semantic layers. - OBSL wins on: richer modeling topologies (multi-rooted DAG with named secondary join paths) where Cube assumes single-rooted cubes plus combining
views; first-class declarative metric types for cumulative (withpartitionBy), period-over-period, and window (rank / lag / lead / ntile / first_value / last_value) where Cube hasrolling_windowandtime_shiftbut they're query/measure patterns, not metric types, and no native rank/lag surface; 9 statistical aggregates (CORR / COVAR_ / REGR_ / STDDEV_ / VAR_); an RDF/SPARQL graph view of the model and a matching interactive ontology-graph playground; two SQL wire protocols — PostgreSQL wire AND Arrow Flight SQL (v2.5.0+) — where Cube ships Postgres wire only; 8 first-class DB-API 2.0 drivers; an explicit CFL multi-fact planner; OSI v0.2 ↔ OBML format conversion; and a simpler operational footprint (no Redis, no scheduler, no separate query orchestrator). - Different niches: Cube is "the production semantic-layer + caching + API gateway" — built to serve high-volume embedded analytics with millisecond response times. OBSL is "an embeddable semantic compiler with a clean REST surface and rich modeling primitives" — best when you don't need pre-aggregation infrastructure and want a smaller dependency footprint.
Cube is the closest peer to OBSL in the OSS space — both are self-hostable, both target embedded analytics, both expose REST/MCP. The interesting differences are in modeling topology, metric expressivity, and the caching/pre-aggregation layer.
1. Modeling philosophy¶
| Aspect | OBSL (OBML) | Cube |
|---|---|---|
| Format | Declarative YAML (OBML) |
YAML or JavaScript (.yml or .js cube definitions); supports Twig/Jinja templating |
| Source of truth | YAML model files | model/cubes/*.{yml,js} and model/views/*.{yml,js} files |
| Top-level constructs | dataObjects, dimensions, measures, metrics, filters |
cubes (with measures, dimensions, segments, joins, pre_aggregations), views (combining cubes), pre_aggregations |
| Object scoping | Each DataObject has columns:; dimensions/measures/metrics live at model scope |
Measures and dimensions live inside each cube; views expose a curated subset for end users |
| Templating | None (static YAML) | Twig (Jinja2-like) — COMPILE_CONTEXT for compile-time multi-tenancy, dynamic SQL, masking |
| Runtime | OSS, self-hosted (single Python service) | OSS Cube Core (Node.js + Cube Store + optional Redis) or Cube Cloud (managed, paid) |
2. Concept mapping¶
| OBSL | Cube | Notes |
|---|---|---|
DataObject |
cube |
Both wrap a physical table or SQL block |
DataObject.columns |
dimensions on a cube |
|
Dimension (model-scoped) |
n/a — dimensions live inside cubes | OBSL has model-scoped dimensions; Cube does not |
Measure |
measures on a cube |
|
Metric type: derived |
measure: { type: number; sql: ... } referencing other measures |
Both first-class |
Metric type: cumulative |
|
Cube has rolling windows but they live on individual measures, not as a separate metric type |
Metric type: period_over_period |
time_shift in queries (compareDateRange, or ) |
Cube does PoP at query time, not in the model |
DataObjectJoin |
joins: inside a cube with relationship: many_to_one/one_to_many/one_to_one |
Cube uses relationship-driven symmetric aggregates |
| Combining multiple data objects in one query | Native via join graph + CFL | view entity required to combine cubes; views are first-class but distinct from cubes |
secondary: true + pathName |
n/a — multiple joins between the same pair require workarounds | No path-name primitive |
QueryObject JSON |
Cube REST query shape (measures, dimensions, filters, timeDimensions, segments, order, limit) |
Different shape but same idea |
Filter (named, reusable) |
segments (named WHERE-style filters reusable in queries) |
3. The headline Cube features¶
3.1 Pre-aggregations (Cube's flagship)¶
Pre-aggregations are materialized rollups that Cube builds, refreshes, and routes queries through automatically. This is the single biggest Cube feature OBSL has no answer for.
cubes:
- name: orders
pre_aggregations:
- name: monthly_revenue
measures: [total_revenue, order_count]
dimensions: [status]
time_dimension: created_at
granularity: month
partition_granularity: month
refresh_key:
every: 1 hour
build_range_start:
sql: SELECT DATE('2020-01-01')
build_range_end:
sql: SELECT NOW()
Capabilities: - Rollup (most common), original_sql, rollup_join, rollup_lambda strategies - Partitioning for incremental refresh - Cube Store — Cube's columnar materialization engine (SQLite/Parquet-backed) - Query routing — Cube transparently rewrites incoming queries to hit the matching pre-aggregation - Sub-second query response times even on billion-row source tables
OBSL has no materialization story. "Make this faster" is the warehouse's job (materialized views, dbt models, scheduled jobs).
3.2 SQL wire protocols for BI tools (OBSL has both; Cube has Postgres wire only)¶
Both projects expose a SQL wire protocol so BI tools can connect to the semantic layer like a database. As of OBSL v2.5.0, OBSL ships two wires side-by-side so you can pick the right transport per consumer:
| OBSL | Cube | |
|---|---|---|
| PostgreSQL wire | ✅ port 5432 (configurable via PGWIRE_PORT) — works with Tableau, DBeaver, Superset, Metabase, Power BI, plain psql, Dremio's Postgres-source connector, anything with a Postgres driver |
✅ port 15432 |
| Apache Arrow Flight SQL | ✅ gRPC port 8815 — columnar transport, JDBC/ODBC via Flight SQL drivers, pyarrow.flight programmatic |
❌ |
| DB-API 2.0 drivers (PEP 249) | ✅ 8 first-party packages (ob-{bigquery,snowflake,postgres,mysql,duckdb,clickhouse,databricks,dremio}) for direct programmatic access |
❌ (SQL API supplants this) |
| Read-only governance | Closed by design: raw SQL → RAW_SQL_REJECTED, DDL/DML → WRITE_OPERATION_REJECTED, catalog probes answered from the model |
Cube SQL API rejects writes; raw SELECTs flow through |
| Self-hostable | Postgres wire is built into the API process; Flight SQL via ob-flight-extension daemon thread |
Built into Cube core |
So OBSL covers the same broad-BI-tool surface as Cube (Postgres wire is everywhere), plus Arrow Flight SQL for consumers that benefit from modern columnar transport (cheaper round-trips for analytical payloads), plus DB-API drivers for Python-native programmatic access. Pick the wire that matches the consumer — for an existing Tableau farm, point it at OBSL's pgwire and it just works.
3.3 Multi-API parity¶
Cube exposes the same semantic model through:
- REST API (/cubejs-api/v1/load)
- GraphQL API (/graphql)
- SQL API (Postgres wire on port 15432)
- MCP (Model Context Protocol)
OBSL exposes:
- REST API (FastAPI)
- PostgreSQL wire (TCP port 5432, v2.5.0+, BI-tool / psql / Dremio-source compatible)
- Arrow Flight SQL (gRPC port 8815, JDBC/ODBC via Arrow Flight SQL drivers)
- DB-API 2.0 drivers (8 PEP 249 packages)
- MCP
Cube uniquely has GraphQL; OBSL uniquely has Arrow Flight SQL, DB-API drivers, and the RDF/SPARQL surface. Both have REST + Postgres-wire SQL + MCP.
3.4 Multi-tenancy and row-level security¶
Cube has first-class multi-tenancy:
- Compile-time multi-tenancy via
COMPILE_CONTEXTin Twig templates — generate different schemas per tenant. - Runtime RLS via
query_rewrite(query, { securityContext })— inject filters based on JWT claims. - Member-level access control via
public: falseon dimensions/measures, conditional masking in Twig. - JWT-based authentication built into the API gateway.
OBSL has session-scoped models (TTL, max-age, rate limits) but no first-class user/auth model. Authn/authz is the host application's job.
3.5 Caching: different goals, different shapes¶
Both projects have caching, but they're solving adjacent problems with different primitives.
Cube has a tiered caching architecture aimed at high-throughput dashboard serving: in-memory query cache, optional Redis, and Cube Store (its purpose-built rollup store backing pre-aggregations). TTL is configured on each cube/measure as a refresh-key SQL or interval — the model author decides when entries are stale, and Cube's scheduler refreshes pre-aggs in the background. This is built for embedded-analytics workloads serving thousands of requests per second.
OBSL has a result cache based on freshness inheritance (CACHE_BACKEND=file, off by default). The structural difference is where the freshness contract lives:
- Cube: TTL is attached to the abstraction (cube / pre-agg / measure). Two cubes reading the same physical table each declare their own TTL — and they can drift.
- OBSL: TTL is derived from the
refresh:contract on each touched physicaldataObject. The minimum contribution wins. TwodataObjectentries on the same warehouse table inherit the same contract automatically. One ETLPOST /v1/heartbeatinvalidates every cached query that depends on the refreshed table — across every dataObject and every session, in one call.
OBSL ships static | scheduled | heartbeat | unknown modes and exposes GET /v1/cache/stats, POST /v1/cache/sweep, POST /v1/cache/clear for observability and manual control. The backend is single-process (DuckDB metadata + Parquet sidecars), so it's per-replica — not a tiered store like Cube's. For multi-replica deployments, OBSL still recommends CACHE_BACKEND=noop until a shared backend (Redis or similar) lands.
Pre-aggregations are still Cube-only. OBSL caches the result of a query that ran against the warehouse; Cube can additionally pre-materialise rollups in Cube Store and route queries through them, which is a different category of optimisation. If your bottleneck is repeat queries, OBSL's result cache helps. If it's queries that scan billions of rows even on first hit, you still need Cube's pre-aggs (or warehouse-side materialised views, or dbt incremental models).
4. Metric types¶
4.1 Aggregation surface on Measure¶
| Family | OBSL | Cube |
|---|---|---|
| Standard | sum, count, count_distinct, avg, min, max |
sum, count, count_distinct, count_distinct_approx, avg, min, max |
| Shape | any_value, median,mode, listagg |
string, time, boolean,number (generic — wraps anyaggregate SQL expression) |
| Statistical | stddev, stddev_pop, variance, var_pop |
Via type: number + raw sql: STDDEV(...) — not a first-class measure type |
| Association / regression | corr, covar_pop, covar_samp, regr_slope, regr_intercept |
Via type: number + raw SQL — not a first-class measure type |
| Grand totals | total: bool on the measure |
n/a — done at query/visualization time |
The honest difference: Cube's type: number is a generic escape hatch — anything the warehouse can express in SQL is reachable, but the model authors write raw SQL that's dialect-specific, arity-unchecked, and opaque to validation. OBSL ships these as first-class declarative aggregations with arity validation (2-column for corr / covar_* / regr_*), per-dialect gating (UNSUPPORTED_AGGREGATION_FOR_DIALECT at compile time), and identical YAML across all 8 dialects. Both approaches reach the same SQL functions in the end; the difference is who owns dialect portability.
4.2 Metric types¶
| OBSL | Cube | Notes |
|---|---|---|
Metric type: derived |
|
Both first-class |
Metric type: cumulative (running, rolling, grain-to-date, per-dimension partitionBy) |
rolling only |
Cube has rolling windows but no grain-to-date, no unbounded cumulative, and no partitionBy as declarative types |
Metric type: period_over_period (4 comparison modes) |
Query-side: compareDateRange parameter, or |
Cube does PoP at query time, not as a model-defined metric |
Metric type: window — rank, dense_rank, row_number, ntile,lag, lead, first_value, last_value |
Via type: number + raw window-function SQL |
OBSL ships these as declarative metric types; Cube reaches them via the same type: number escape hatch |
| Reusable filter context / filtered measures | segments + per-measure filters |
Comparable |
Different philosophies: OBSL bakes time-aware metrics into the model (write once, every query gets the comparison). Cube treats time comparisons as query-time concerns (more flexible, but the consumer has to know how to ask). Either fits depending on whether you're publishing a metrics catalog or empowering query authors.
See Trend Analysis for OBSL's full v2.6 surface (window functions, statistical aggregates, dialect coverage matrix).
5. Joins¶
| OBSL | Cube | |
|---|---|---|
| Definition site | joins: array on DataObject |
joins: block inside each cube |
| Cardinality | joinType: many-to-one, one-to-one, many-to-many |
relationship: one_to_one, one_to_many, many_to_one |
| What cardinality drives | Static fanout detection + CFL multi-fact planning | Symmetric aggregates |
| Join condition | columnsFrom/columnsTo arrays |
sql: "{CUBE}.id = {other.foo_id}" (free-form SQL with {CUBE} reference) |
| Multiple paths | First-class via secondary: true + named pathName, query-time selection via usePathNames |
No path-name primitive; workaround via views exposing one path or aliased cubes |
| Join direction | Directed, declared per data object | Bidirectional inference based on relationship: |
| Symmetric aggregates | ❌ (uses CFL) | ✅ |
Cube's relationship: keyword drives symmetric aggregate logic (similar to LookML and Malloy). OBSL takes the static-fanout-detection + CFL approach instead.
6. Data modeling topology (a major differentiator)¶
Cube is fundamentally single-rooted per cube: each cube has its own measures and dimensions, joins fan out from there. To combine multiple cubes into a unified end-user surface, Cube uses views — a separate entity that selectively re-exposes measures/dimensions from joined cubes. Views are good, but they don't escape the underlying single-rooted-per-cube topology, and multi-path scenarios (two valid joins between the same pair of cubes) aren't a first-class primitive.
OBSL is built on a directed join graph (DAG) with explicit support for richer topologies:
| Topology | Star (single fact + dims) | Snowflake (chained dims) | Multi-rooted (multiple facts) | Multi-path (alt. joins between same pair) | Cycles |
|---|---|---|---|---|---|
| OBSL | ✅ | ✅ | ✅ via CFL UNION ALL legs with per-leg common root |
✅ first-class via secondary: true + pathName + per-query usePathNames |
Detected and rejected |
| Cube | ✅ | ✅ | Via views combining cubes (workable but indirect) |
Workaround via duplicate cubes or views | Implicit |
Why this matters: When you need a single semantic surface that exposes revenue + support tickets by customer, or to choose between ship_address_id and billing_address_id joins to the same address dimension per query, Cube wants you to design views upstream and pick the right one. OBSL lets you model the messy graph as-is and resolve at query time.
7. Dialects / data sources¶
Cube supports a very broad list including all OBSL dialects plus more.
| Source | OBSL | Cube |
|---|---|---|
| BigQuery | ✅ | ✅ |
| Snowflake | ✅ | ✅ |
| Postgres | ✅ | ✅ |
| MySQL | ✅ | ✅ |
| DuckDB | ✅ | ✅ |
| Databricks | ✅ | ✅ |
| ClickHouse | ✅ | ✅ |
| Dremio | ✅ | ❌ |
| Redshift | ❌ | ✅ |
| Trino / Presto | ❌ | ✅ |
| MS SQL / Oracle | ❌ | ✅ |
| Druid | ❌ | ✅ |
| Athena, Hive, Vertica, etc. | ❌ | ✅ |
Cube wins on breadth (especially legacy enterprise sources); OBSL covers modern cloud warehouses well and is competitive on Databricks/ClickHouse, with Dremio unique to OBSL.
8. APIs / interfaces¶
| OBSL | Cube | |
|---|---|---|
| REST API | Yes — first-party FastAPI service | Yes — /cubejs-api/v1/load, mature |
| GraphQL | No | Yes |
| SQL wire protocol | PostgreSQL wire (port 5432) AND Apache Arrow Flight SQL (gRPC, columnar) — pick per consumer | Postgres wire (port 15432) only |
| DB-API 2.0 drivers | Yes — 8 PEP 249 drivers shipped (ob-{bigquery,snowflake,postgres,mysql,duckdb,clickhouse,databricks,dremio}) |
No (SQL API supplants this) |
| MCP | Yes — first-party server | Yes — first-party server |
| Native SDKs | Python (FastAPI client) + DB-API drivers | JS/TS (@cubejs-client/*), React, Vue, Angular bindings |
| UI / Playground | Interactive Gradio playground: SQL Compiler, Query Results, auto-generated Mermaid ER diagrams, interactive RDF/OBSL ontology graph (vis-network), OSI import/export, settings panel | Cube Playground (OSS): query builder, schema browser, generated SQL preview. Cube Cloud Studio (paid) adds advanced workspace features |
| RDF graph + SPARQL | Yes (/graph, /sparql) |
No |
| Format conversion | OSI ↔ OBML (/convert/*) |
No equivalent |
Cube's SQL API is a structural advantage for BI-tool connectivity. OBSL's RDF/SPARQL surface is unique for governance/lineage tooling.
9. Open-source vs. commercial story¶
Both projects ship a free OSS core and offer commercial extensions, but the split is different:
| OBSL | Cube | |
|---|---|---|
| Core license | Source-available (BSL 1.1) | Apache 2.0 |
| Self-hostable | ✅ (one Python service) | ✅ (Cube Core: Node.js + optional Cube Store + optional Redis) |
| Commercial offering | Embedded analytics license · commercial cloud offering · enterprise features · consulting + support | Cube Cloud — managed runtime, multi-cluster, advanced security, Studio IDE, paid |
| Operational footprint | Light: one process, in-memory sessions, optional file-backed result cache (DuckDB metadata + Parquet) on local disk | Heavier: API server + Cube Store + Redis (optional) + scheduler + refresh workers in production |
| Self-host parity | OSS has full parity on the shipped v2.6 surface; enterprise tier adds enterprise-specific capabilities on top | Many advanced features (Studio, advanced workspaces, AI features) are Cube Cloud-only |
For a small embedded-analytics use case OBSL is operationally simpler. For high-throughput multi-tenant production with heavy caching across replicas, Cube's architecture is purpose-built and Cube Cloud provides the managed experience. OBSL's file cache based on freshness inheritance (v2.2.0) covers single-replica result-caching workloads — agents, dev/staging, modest production — without standing up Redis or a rollup store.
10. Other distinctives¶
| Feature | OBSL | Cube |
|---|---|---|
| Pre-aggregations / materialized rollups | ❌ | ✅ flagship |
| SQL wire protocol for BI tools | ✅ PostgreSQL wire + Arrow Flight SQL (two wires side-by-side, v2.5.0+) | ✅ Postgres wire only |
| DB-API 2.0 drivers (PEP 249) | ✅ 8 drivers shipped | ❌ |
| Interactive RDF/ontology graph in playground | ✅ vis-network ontology view | ❌ (schema browser only) |
| GraphQL API | ❌ | ✅ |
| Twig/Jinja templating | ❌ | ✅ |
| Multi-tenancy primitives | Session-scoped only | ✅ first-class (compile-time + runtime) |
| Row-level security in model | ❌ | ✅ via query_rewrite |
| Field-level masking | ❌ | ✅ via Twig conditionals |
| First-class PoP metric type | ✅ (4 comparison modes) | ❌ (time_shift at query time) |
| First-class cumulative metric type | ✅ (running, rolling, grain-to-date, partitionBy) |
Partial (rolling_window only) |
| First-class window metric type (rank / lag / lead / ntile / first_value / last_value) | ✅ | Via type: number + raw SQL |
Statistical aggregates (stddev, variance, corr, covar_*, regr_*) as first-class measure types |
✅ 9 declarative aggregations | Via type: number + raw SQL |
| Multi-rooted DAG modeling | ✅ via CFL | ❌ (single-rooted cubes + views) |
| Named secondary join paths | ✅ | ❌ |
| Symmetric aggregates | ❌ (uses CFL) | ✅ |
| RDF/SPARQL graph view | ✅ | ❌ |
| OSI ↔ OBML conversion | ✅ | ❌ |
| Built-in caching layer | ❌ | ✅ multi-tier |
| Cube Store / materialization engine | n/a | ✅ |
| Operational simplicity | Single Python process | Multiple services in production |
11. When to pick which¶
Pick Cube when:¶
- You need pre-aggregations for sub-second analytics on large datasets — this is Cube's defining feature and OBSL has no equivalent.
- You need GraphQL alongside REST (e.g. for a frontend that already uses Apollo).
- You need first-class multi-tenancy + RLS without writing it yourself.
- Your model needs dynamic SQL via templating (Twig/Jinja) — per-tenant table names, masking, conditional logic.
- You're building high-throughput embedded analytics with thousands of concurrent dashboard requests.
- You're willing to operate Node.js + Cube Store + Redis, or pay for Cube Cloud.
Pick OBSL when:¶
- You need richer modeling topologies — multi-rooted facts, named alternative join paths — and don't want to design around them via views.
- You want first-class declarative cumulative and period-over-period metric types instead of expressing them via query-time
time_shiftor per-measurerolling_window. - You want a graph view of the model (RDF/SPARQL) for governance/lineage tooling.
- You need OSI interoperability for moving models between semantic layer formats.
- Your operational appetite is small — one Python service, no Redis, no scheduler, no separate query orchestrator.
- You're targeting Dremio or otherwise want full feature parity in self-hosted OSS without a Cloud upgrade path.
- Your consumers are agents/LLMs primarily — both projects expose MCP, but OBSL's smaller surface is easier to point an agent at without query-shape ambiguity.
They could coexist¶
A workable hybrid: use Cube as the production query gateway with pre-aggregations and a SQL API for BI tools, and OBSL as a complementary modeling/governance surface (RDF graph, OSI export, richer topology modeling, result cache based on freshness inheritance for agent workloads) that you can keep authoritative and mirror into Cube definitions. Two semantic surfaces is operationally heavier, but it's defensible if you need both Cube's pre-aggs and OBSL's modeling primitives.
12. Gap analysis¶
To match Cube, OBSL would need:¶
- Pre-aggregations / materialization — declarative rollup definitions, refresh strategies, query routing. This is a major piece of infrastructure (a Cube-Store-equivalent or an integration with an external materialization engine like dbt/MaterializedView). OBSL's result cache shipped in v2.2.0 (file backend, freshness-derived TTL) addresses the "repeat-query" axis but not the "first-hit on a billion rows" axis pre-aggs cover.
- GraphQL API — modest effort given the existing FastAPI surface.
- Templating in the model — Jinja2 or similar for compile-time dynamism.
- Multi-tenancy primitives — compile-time per-tenant model generation, runtime RLS hooks, JWT integration.
- Field-level masking and access grants — analogous to LookML's
required_access_grants. - More dialects — Trino/Presto, Redshift, MSSQL, Oracle if those markets matter.
- Shared cache backend for multi-replica deployments — current file backend is per-replica; Cube's Redis/Cube-Store layer is shared across the cluster.
To match OBSL, Cube would need:¶
- First-class cumulative & period-over-period metric types — declarative versions of what's currently
rolling_windowand query-sidetime_shift, including per-dimensionpartitionByon cumulative. - First-class window metric type —
rank,dense_rank,row_number,ntile,lag,lead,first_value,last_valueas declarative metric types rather than reaching them throughtype: number+ raw window-function SQL. - First-class statistical / regression aggregations —
stddev,variance,corr,covar_*,regr_slope,regr_interceptas declarative measure types with arity validation and per-dialect gating, rather than viatype: number+ raw SQL. - Multi-rooted DAG modeling — the ability to query across genuinely independent fact tables in one go without the consumer needing to pre-design a
view. - Named secondary join paths with per-query selection.
- RDF/SPARQL graph surface for governance/lineage.
- Apache Arrow Flight SQL as a columnar wire protocol option alongside Postgres wire — modern analytical payloads transfer more efficiently in Arrow.
- First-class DB-API 2.0 drivers for direct programmatic access from Python (PEP 249).
- Interactive RDF/ontology graph visualization in the playground (vis-network-style) — Cube Playground's schema browser is functional but not graph-shaped.
- OSI ↔ Cube model round-trip for portability between semantic layers.
References¶
- OBSL
MetricTypeenum:src/orionbelt/models/semantic.py - OBSL CFL planner:
src/orionbelt/compiler/cfl.py - OBSL fanout detection:
src/orionbelt/compiler/fanout.py - OBSL docs: Model Format, Period-over-Period Metrics, Compilation Pipeline
- Cube docs: https://cube.dev/docs
- Cube data modeling reference: https://cube.dev/docs/product/data-modeling/reference
- Cube pre-aggregations: https://cube.dev/docs/product/caching/using-pre-aggregations
- Cube SQL API: https://cube.dev/docs/product/apis-integrations/sql-api