English · 01:44:00 Jan 27, 2026 4:09 AM
Rafal Dittwald, “Data Oriented Programming”
SUMMARY
Rafal Dittwald, a Toronto entrepreneur and Clojure engineer with a decade of experience building web apps, presents two ideas common in the Clojure community: data-oriented programming (favoring generic data structures over objects) and data-driven programming (using data to express DSLs). He explores definitions, benefits, tradeoffs, and cross-language applicability, taking audience questions along the way.
STATEMENTS
- Data-oriented programming involves choosing generic data structures like hash maps and arrays over objects and typed records throughout applications.
- In Clojure, data-oriented programming means passing plain data representations, similar to JSON objects, as the primary way to handle information.
- Data-driven programming prefers generic data structures instead of code for tasks like creating domain-specific languages, deriving functionality from data.
- Clojure's functional programming focus and dynamic typing enable data-oriented practices by allowing flexible use of primitives without strict types.
- Generic data structures in Clojure include hash maps, vectors, lists, sets, numbers, strings, and (optionally namespaced) keywords, used everywhere in programs.
- HTTP requests in Clojure's Ring spec are represented as plain hash maps, passed to functions and returned as responses without object methods.
- Database interactions in Clojure often use plain data structures for queries and results, avoiding specialized response objects.
- Using generic data provides access to Clojure's rich standard library functions for manipulation, like merging, diffing, or extracting keys and values.
- Tradeoffs of data-oriented programming include losing static typing benefits like compile-time checks, documentation, and validation.
- Clojure uses runtime specs for validation at edges, such as incoming HTTP data, rather than throughout the application.
- Encapsulation is downplayed in data-oriented programming, as data is open, but functional practices mitigate propagation issues.
- Performance may suffer with generic data for certain cases like video streams, but it's the default unless impractical.
- Languages like Ruby and JavaScript allow generic data but cultural norms favor classes and objects.
- Clojure's literal syntax for primitives makes creating generic data structures easy and expressive.
- JSON's ubiquity shows generic data is sufficient for modeling complex APIs despite being less rich than XML.
- Clojure's immutable data and value semantics enable easy comparisons and manipulations without deep traversal.
- Data-oriented programming treats plain data as the lingua franca, sacrificing type safety for flexibility.
- Runtime type errors are possible in dynamic languages, mitigated by tests and specs in Clojure.
- Refactoring in data-oriented codebases relies on tests and functional purity rather than compiler feedback.
- Clojure codebases default to data-oriented styles, with records used sparingly despite availability.
- In Ruby, shifting to data-oriented would mean using hash maps for entities instead of classes.
- Type classes in languages like Haskell provide common behaviors across types, similar to Clojure's interfaces.
- Validation in data-oriented programming uses runtime specs for keys, types, ranges, and regex patterns.
- Fancy type systems in Haskell or Scala reduce the appeal of data-oriented approaches by minimizing type verbosity.
- Dynamic typing critiques often target mediocre systems, while extremes like untagged Tcl lists enable similar flexibility.
- Data-driven programming uses plain data for embedded DSLs, avoiding text, fluent APIs, or macros.
- DSLs like SQL, regex, or JSX solve specific problems declaratively within general languages.
- Fluent APIs chain methods for composability but can be verbose and opaque for inspection.
- Macro-based DSLs in Lisp alter syntax but hide internal representations, limiting introspection.
- Data-driven DSLs represent queries or configs as hash maps, enabling easy generation and manipulation.
- Benefits of data-driven include transparency, immutability, serialization, and spec validation.
- Tradeoffs of data-driven include verbosity compared to custom syntax and potential performance hits from parsing.
- YAML as a DSL syntax, as in Ansible, breaks down once tasks need general-purpose, imperative logic.
- Libraries like HoneySQL convert data structures to SQL, allowing dynamic query building.
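
To make the Ring-style statements above concrete, here is a minimal sketch using only clojure.core and clojure.string. The request values, the `handler` function, and the `x-request-uri` header are illustrative assumptions, not part of the Ring spec or of the talk.

```clojure
(require '[clojure.string :as str])

;; A request shaped like a Ring request map (values are made up).
(def request
  {:request-method :get
   :uri            "/films"
   :headers        {"accept" "text/plain"}
   :params         {"genre" "noir"}})

;; A handler is just a function from a request map to a response map.
(defn handler [req]
  {:status  200
   :headers {"content-type" "text/plain"}
   :body    (str "Listing " (get-in req [:params "genre"] "all") " films")})

;; Because the response is ordinary data, generic functions apply to it.
(def response
  (-> (handler request)
      (update :body str/upper-case)
      (assoc-in [:headers "x-request-uri"] (:uri request))))

(:body response)          ;; => "LISTING NOIR FILMS"
(sort (keys response))    ;; => (:body :headers :status)
```

Nothing here calls a method on a request or response object; `get-in`, `update`, and `assoc-in` are the same functions used on any other map in the program.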
IDEAS
- Clojure developers intuitively recognize data-oriented programming despite lacking a strict definition, reflecting community zeitgeist.
- Passing JSON-like objects everywhere simplifies programs by avoiding class hierarchies, even in languages that support types.
- Data-driven programming turns metadata into executable DSLs without macros, just by deriving from plain structures.
- Clojure's JVM hosting allows Java object creation, but data-oriented rejects it for primitives to gain flexibility.
- HTTP and database layers in Clojure treat requests/responses as uniform hash maps, decoupling from vendor specifics.
- Generic data unlocks universal functions like map or reduce across types, unlike object-specific methods.
- Losing encapsulation in data-oriented code is overstated, as generic functions remain unaffected by internal changes.
- Runtime specs in Clojure act as optional typing, checking shapes only where needed, like API boundaries.
- Value semantics in immutable data make equality checks straightforward, avoiding reference pitfalls.
- JSON's success proves minimal primitives suffice for rich modeling, influencing Clojure's data choices.
- Clojure's standard library assumes data manipulation, with interfaces enabling iteration over diverse structures.
- JavaScript's mix of mutating and non-mutating array operations frustrates functional data handling, hindering data-oriented adoption.
- Typed Clojure efforts reveal challenges in retrofitting static types to dynamic lens-like operations.
- Data-driven avoids DSL complexity by limiting expressiveness, balancing power with simplicity.
- Fluent APIs enable runtime query generation but often feel awkward compared to native SQL.
- Macro DSLs provide syntactic sugar but opacity blocks debugging or serialization.
- Representing regex as nested maps allows composition without string hacks, aiding dynamic patterns.
- Routing in web servers as data enables config-driven generation, with conditionals and loops producing variants.
- UI templates as Hiccup data structures integrate loops and conditions seamlessly, unlike JSX ternaries.
- CSS as data maps exploits composition for mixins, bypassing language inventions.
- GraphQL-style queries expressed as data (e.g., Datalog) can be adjusted on the fly, unlike rigid text-based query languages.
- Specs as data enhance introspection and storage, surpassing macro-based validation.
- Application pages defined as data auto-generate endpoints and frontend links.
- Routing arrays with middleware functions mix data and code for practical DSLs.
- Describing "super functions" as data splits preconditions from effects, so the same definition can be reused on AWS Lambda or a server.
- AWS APIs as JSON generate libraries across languages, showcasing data's multi-target power.
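
The idea of UI templates as Hiccup-shaped data can be sketched with a toy renderer. `render` and `film-list` below are hypothetical helpers written for this note, not Hiccup's implementation: there is no escaping, no void-element handling, and no `:div#id.class` shorthand.

```clojure
(require '[clojure.string :as str])

;; Toy renderer for Hiccup-shaped vectors: [tag attrs? & children].
(defn render [node]
  (cond
    (vector? node)
    (let [[tag & more] node
          attrs    (when (map? (first more)) (first more))
          children (if attrs (rest more) more)]
      (str "<" (name tag)
           (str/join (for [[k v] attrs] (str " " (name k) "=\"" v "\"")))
           ">"
           (str/join (map render children))
           "</" (name tag) ">"))
    (seq? node) (str/join (map render node)) ; output of `for` or `map`
    (nil? node) ""                           ; output of a `when` that did not fire
    :else       (str node)))

;; Loops and conditionals are ordinary Clojure expressions inside the data.
(defn film-list [films admin?]
  [:ul {:class "films"}
   (for [f films] [:li (:title f)])
   (when admin? [:li [:a {:href "/films/new"} "Add film"]])])

(render (film-list [{:title "Laura"} {:title "Gilda"}] false))
;; => "<ul class=\"films\"><li>Laura</li><li>Gilda</li></ul>"
```

Since `film-list` returns plain vectors, a test can assert on the structure itself before any HTML string exists.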
INSIGHTS
- Embracing generic data as a program's core language fosters unprecedented composability, turning rigidity into fluidity.
- Dynamic typing's freedom shines when paired with runtime invariants, offering validation without syntactic burden.
- Data-driven interfaces democratize DSL creation, empowering non-experts to generate code via simple structures.
- Tradeoffs in encapsulation reveal that openness accelerates evolution, as changes localize in functional namespaces.
- Immutability transforms data into pure values, simplifying reasoning and enabling effortless reuse across contexts.
- Community norms, not language limits, dictate data-oriented adoption, suggesting cultural shifts in other ecosystems.
- Specs as data bridge dynamic and static worlds, allowing expressive constraints without compile-time rigidity.
- Extremes in typing—untagged lists or dependent types—unlock paradigms mediocre systems obscure.
- Plain data's transparency erodes code's opacity, making systems inspectable and extensible at runtime.
- DSLs thrive on bounded expressiveness; data structures enforce this naturally, curbing complexity explosions.
- Meta-programming via data generation outperforms code-based macros in serializability and testability.
- Applications as data-fed engines modularize concerns, adapting schemas to diverse sub-systems effortlessly.
- JSON's ubiquity hints at universal modeling primitives, scalable from configs to full APIs.
- Functional interfaces unify behaviors across types, mimicking type classes without type overhead.
- Runtime flexibility in validation captures nuances static types overlook, like regex or range checks.
QUOTES
- "Data oriented programming is the choice of using generic data structures and preferring them to objects and typed records."
- "Imagine if we just pass json objects around everywhere in our programs, so we didn't use fancy classes or types."
- "Data driven programming is like choosing that over writing macros or functions or objects or whatever."
- "What if we use json as the DSL everything like what if you needed to write some regex What if it was in json."
- "It's the practice generally of just like having these kinds of primitive hash maps that aren't typed."
- "By using these kind of generic data structures, you get access to all of the functions clojure and libraries."
- "Encapsulation a little overhyped overrated and I think you know as a functional programming group we might also be a little bit of that."
- "Data is much simpler and easier to manipulate and compose and generate than code is."
- "The first rule of macro club is don't write macros."
- "With typed languages, you have to spend all your time, just like quieting the type system."
- "Everything is data."
- "Let's just use plain data structures as our interfaces into any of these kind of DSL vm."
- "Data structures that don't have behavior that are much simpler than you know methods functions objects."
- "Trivial to manipulate I could insert remove things I can generate this very, very easily."
- "Using data as an interface is awesome like in simple stupid data is awesome."
- "We can then use the same data to deploy to aws Lambda."
- "Represent every single endpoint in there in aws with json. In a declarative way, and then they just generated all those libraries."
- "This data driven this is enabled by this like data oriented practice right."
- "Rip out a bunch of stuff into just data structures that don't have any behavior and then parse those data structures to generate behavior."
HABITS
- Stick exclusively to Clojure and ClojureScript for web app development over the last ten years.
- Default to generic data structures in all internal communications, avoiding typed records unless essential.
- Validate data shapes at runtime using specs primarily at application edges like APIs.
- Write automated tests to compensate for lack of compile-time type checks during refactoring.
- Use literal syntax for quick creation of hash maps, vectors, and keywords in code.
- Pass plain data between functions throughout the application, treating it as the lingua franca.
- Lean on Clojure's standard library for universal manipulations like mapping or reducing across types.
- Avoid macros, writing only a few over a decade of full-time development.
- Represent domain entities like users as hash maps with IDs and arrays, not classes.
- Generate dynamic structures with if-statements and loops for configs or queries.
- Mix data with minimal functions in practical DSLs, referencing handlers by keywords.
- Define application sub-components like pages or routes as declarative data for multi-target generation.
- Use immutable data for all manipulations to ensure predictable behavior and easy comparisons.
- Inspect and compose data structures at runtime for debugging and extension.
- Push validation to sensitive areas, relying on functional purity elsewhere.
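
Several of the habits above (entities as maps, relations as IDs, immutable updates, value equality) fit in a few lines. This is a sketch with made-up keys and values; `teams` and `team-names` are hypothetical names.

```clojure
(require '[clojure.data :as data])

;; An entity is a plain map; relations are stored as IDs.
(def user {:user/id 1 :user/email "ada@example.com" :user/team-ids [10 20]})
(def teams {10 {:team/name "Core"} 20 {:team/name "Docs"}})

;; "Updating" returns a new value; `user` itself is unchanged.
(def renamed (assoc user :user/email "ada@example.org"))

;; Equality is by value, so an equal map built elsewhere compares equal.
(= user {:user/id 1 :user/email "ada@example.com" :user/team-ids [10 20]}) ;; => true

;; Resolve the relation with ordinary lookups.
(defn team-names [u]
  (mapv #(get-in teams [% :team/name]) (:user/team-ids u)))

(team-names user) ;; => ["Core" "Docs"]

;; Generic diffing works because both values are plain data.
(let [[only-old only-new _both] (data/diff user renamed)]
  [only-old only-new])
;; => [{:user/email "ada@example.com"} {:user/email "ada@example.org"}]
```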
FACTS
- By the speaker's estimate at the time of the talk, Clojure had been around for 10 to 12 years, evolving from macro-heavy to data-oriented styles.
- Clojure runs on the JVM, enabling seamless Java interop while favoring functional paradigms.
- Ring spec represents HTTP requests as hash maps with keys like :uri, :params, and :headers.
- Clojure includes ratios and UUID literals as primitive types beyond JSON basics.
- Keywords in Clojure can be namespaced and evaluate to themselves, which distinguishes them from symbols, which are used to name variables.
- JavaScript objects treat all keys as strings, unlike Clojure's type-preserving maps.
- Clojure implements interfaces on primitives, allowing map/filter on hash maps, strings, and vectors.
- By the speaker's account, roughly 90% of the effort in Haskell's lens library went into type system compatibility.
- Tcl treats every value as a string, with lists as untagged strings, enabling data-driven flexibility without type tags.
- AWS represents all API endpoints declaratively in JSON to generate language-specific libraries.
- Clojure's core.async implements Go-like channels using macros for concurrent runtimes.
- JSX requires a superset compiler to intermix HTML and JavaScript.
- Rails routing uses objects to build internal representations for URL handling.
- HoneySQL converts nested maps to SQL without string concatenation.
- Hiccup uses vectors of keywords/strings for HTML, integrating Clojure expressions.
- Garden library compiles Clojure maps to CSS, enabling functional composition.
- Datomic's query language Datalog uses data structures for graph queries.
- clojure.spec's original API is macro-based; later work shifted toward specs as data for better introspection.
- Compojure's macro-based routing was opaque, limiting dynamic generation.
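
A few of the language facts above can be checked at a REPL. The `order` and `lookup` values are invented for illustration.

```clojure
;; Literals beyond JSON's basics: UUIDs, ratios, sets, namespaced keywords.
(def order
  {:order/id    #uuid "123e4567-e89b-12d3-a456-426614174000"
   :order/tax   1/20
   :order/tags  #{:gift :rush}
   :order/lines [{:sku "A1" :qty 2} {:sku "B2" :qty 1}]})

(namespace :order/id)        ;; => "order"
(name :order/id)             ;; => "id"
(uuid? (:order/id order))    ;; => true
(ratio? (:order/tax order))  ;; => true

;; Map keys keep their type; they are not coerced to strings.
(def lookup {42 :number, "42" :string, [4 2] :vector})
(lookup 42)     ;; => :number
(lookup "42")   ;; => :string
(lookup [4 2])  ;; => :vector

;; The same sequence functions work across vectors, strings, and maps.
(map first [[1 2] "hi" {:a 1}])          ;; => (1 \h [:a 1])
(reduce + (map :qty (:order/lines order))) ;; => 3
```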
REFERENCES
- Clojure language and ecosystem, including standard library for data manipulation.
- ClojureScript for frontend web apps.
- Ring spec for HTTP request/response handling.
- JSON as a data representation model.
- Ruby on Rails for DSL examples like routing.
- JavaScript objects and early Node.js.
- PHP as prior development language.
- C++ data-oriented design in video games for data flow optimization.
- Haskell type classes and lenses library.
- Scala for static typing comparisons.
- F# computation expressions and type systems.
- Tcl for untagged lists and dynamic flexibility.
- SQL and regex as embedded DSLs.
- JSX and React for UI templating.
- Core.async library for Go-like concurrency.
- HoneySQL library for data-to-SQL conversion.
- Compojure for macro-based routing.
- Reitit library for data-oriented routing.
- Hiccup for HTML data structures.
- Garden for CSS data compilation.
- GraphQL query language.
- Datalog and the EQL (EDN Query Language) spec for graph queries.
- clojure.spec for validation, with Malli as an alternative.
- AWS API JSON specs for library generation.
- Ansible and YAML for declarative deployments.
- Regal library for data-based regex.
HOW TO APPLY
- Identify opportunities in your codebase to replace custom classes with hash maps for entities like users or requests.
- Refactor HTTP handlers to accept and return plain data maps, removing method calls on request objects.
- Introduce runtime specs for validating incoming API data shapes, keys, and value types.
- Use literal syntax to define data structures inline, such as {:user/id 1 :email "example@domain.com"}.
- Apply universal functions like map or filter to data across types, testing composability.
- Generate dynamic queries by composing maps in loops or conditionals before passing to libraries.
- Define routing as arrays of [path handler middleware] to enable config-driven servers.
- Represent UI components as nested vectors, e.g., [:div {:class "container"} [:h1 "Title"]], for templating.
- Validate domain constraints like email regex or ID ranges using spec functions at boundaries.
- Extract application schemas into central data, massaging them for sub-engines like databases or UIs.
- Split functions into data preconditions, effects, and handlers for reuse across deployments.
- Serialize configs or pages as EDN/JSON for storage and reloading in production.
- Test data manipulations by asserting on generated structures before execution.
- Inspect query data for joins or selections using get/keys to debug DSL inputs.
- Compose styles as maps, merging subsets for reusability instead of custom mixins.
- Generate regex patterns by nesting maps, avoiding string interpolation risks.
- Define pages with metadata like {:path "/home" :view-fn render-home :middleware auth}, auto-building endpoints.
- Use data to drive multi-target builds, e.g., same routes for web servers and serverless functions.
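
The validation steps above can be sketched with clojure.spec.alpha, which ships with Clojure 1.9 and later. The spec names, the ID range, the deliberately loose email regex, and `parse-signup` are all assumptions chosen for illustration.

```clojure
(require '[clojure.spec.alpha :as s])

;; Value-level constraints: a range check and a (loose) regex check.
(s/def :user/id (s/and int? #(<= 1 % 1000000)))
(s/def :user/email (s/and string? #(re-matches #"[^@\s]+@[^@\s]+\.[^@\s]+" %)))

;; Shape-level constraint: which keys an incoming signup must carry.
(s/def :user/signup (s/keys :req [:user/id :user/email]))

;; Validate once at the edge; inside the app the map flows unchecked.
(defn parse-signup [payload]
  (if (s/valid? :user/signup payload)
    {:ok payload}
    {:error (s/explain-str :user/signup payload)}))

(parse-signup {:user/id 1 :user/email "example@domain.com"})
;; => {:ok {:user/id 1, :user/email "example@domain.com"}}
```

A payload with `:user/id 0` or a missing `:user/email` comes back under `:error` with a human-readable explanation string instead of throwing.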
ONE-SENTENCE TAKEAWAY
Embrace generic data structures as your program's lingua franca to unlock flexibility, composability, and simplicity in both core logic and DSLs.
RECOMMENDATIONS
- Prioritize plain data over classes in dynamic languages to leverage standard library universality.
- Validate data at runtime edges with specs, avoiding blanket typing for internal flows.
- Default to immutable structures for manipulations, ensuring predictable equality and threading.
- Generate DSL inputs dynamically via data composition, enabling runtime adaptability.
- Limit DSL expressiveness to data primitives, preventing complexity from macro or fluent chains.
- Use hash maps for entities, embedding relations as IDs to flatten object graphs.
- Test refactorings with comprehensive suites, compensating for absent compiler guidance.
- Inspect data transparently during debugging, querying keys/values instead of hidden states.
- Serialize configurations as data for easy versioning and external sourcing.
- Mix minimal functions into data DSLs only when pure structures fall short.
- Architect apps as data-fed sub-engines, centralizing schemas for propagation.
- Avoid macros for DSLs, favoring data for introspection and serialization benefits.
- Adopt value semantics universally to simplify comparisons without deep clones.
- Push validation to sensitive domains, trusting functional purity elsewhere.
- Experiment with data-driven routing for web apps, supporting variant generations.
- Represent queries as nested maps for libraries like SQL or graph engines.
- Reuse data schemas across targets like web and serverless for DRY deployments.
- Look to the extremes of typing rather than the middle: untagged data for fluidity, dependent types for precision.
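
The recommendations on serializing configuration and keeping functions out of the data can be sketched as an EDN round trip. The `pages` and `views` definitions are hypothetical; the point is that handlers are referenced by keyword so the data stays serializable.

```clojure
(require '[clojure.edn :as edn])

;; Page definitions are pure data; views are named by keyword, not embedded.
(def pages
  [{:path "/home"  :view :home  :auth? false}
   {:path "/admin" :view :admin :auth? true}])

;; Behavior lives in a separate registry keyed by those keywords.
(def views
  {:home  (fn [] "Welcome")
   :admin (fn [] "Admin console")})

;; Serialize to an EDN string (it could be written to a file) and read it back.
(def serialized (pr-str pages))
(def restored (edn/read-string serialized))

(= pages restored) ;; => true

(defn render-page [page]
  ((views (:view page))))

(render-page (second restored)) ;; => "Admin console"
```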
MEMO
In the vibrant ecosystem of Clojure, where functional purity meets pragmatic innovation, Rafal Dittwald, a Toronto-based entrepreneur with over a decade immersed in the language, demystifies two intertwined paradigms: data-oriented and data-driven programming. Far from a fleeting trend, these approaches stem from Clojure's roots on the JVM, a platform steeped in object-oriented rigidity, yet Clojure rebels by championing dynamic, untagged data as the essence of computation. Dittwald, who transitioned from Ruby and early JavaScript to Clojure for startup web apps, illustrates how developers sidestep Java's verbose classes, opting instead for hash maps and vectors—primitives akin to parsed JSON—as the program's connective tissue. This choice isn't mere simplicity; it's a deliberate pivot toward flexibility, where HTTP requests arrive as plain maps via the Ring spec, databases yield unadorned result arrays, and functions thread these structures seamlessly, unbound by type hierarchies.
Yet Dittwald doesn't shy from the frictions. Static typing advocates, like those wielding F# or Haskell, prize compile-time safeguards—refactoring ripples caught instantly, invalid states unrepresentable. In Clojure's dynamic realm, such assurances yield to runtime specs: lightweight validations enforcing shapes, regexes, and ranges at API frontiers, while internal flows rely on immutability and tests. Encapsulation, a pillar of object paradigms, dissolves here; data lies bare, but Dittwald argues this openness accelerates change. Generic functions—map, reduce, filter—operate indifferently across strings, maps, or lists, insulating code from internal tweaks. Audience pushback highlights the divide: How do you trace a key's evolution without a compiler's map? Dittwald counters with functional purity and coverage, noting Clojure's evolution from macro-laden origins to this restrained ethos, where "the first rule of macro club is don't write macros."
Shifting to data-driven programming, Dittwald unveils a meta-layer: using plain data not just for values, but as the scaffold for domain-specific languages. Embedded DSLs like SQL or regex already punctuate general code; why not extend this to routing, UI, or styles? Traditional paths—text strings, fluent method chains, macro sorcery—introduce opacity and rigidity. Strings invite concatenation pitfalls; fluent APIs, while composable, demand awkward chaining; macros veil internals, thwarting inspection. Data-driven flips the script: Represent a SQL query as a nested map {:select [:title] :from :films :join [:actors]}, pass it to HoneySQL, and watch it bloom into executable code. This transparency empowers runtime generation—loop over configs, conditional branches—while immutability ensures safety. Libraries like Hiccup render HTML from vector literals, Garden compiles CSS maps with effortless merging, and even AWS APIs, specced in JSON, spawn client libraries across tongues.
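
As a rough illustration of the query-as-data idea, the sketch below turns a small map into a SQL string. `format-query` and `films-query` are hypothetical helpers written for this note and are not HoneySQL's API; the formatter does no quoting or parameterization, so it is unsuitable for untrusted input.

```clojure
(require '[clojure.string :as str])

;; Toy formatter: supports :select, :from, :where (vector of [col op value]), :limit.
(defn format-query [{:keys [select from where limit]}]
  (str/join " "
            (cond-> [(str "SELECT " (str/join ", " (map name select)))
                     (str "FROM " (name from))]
              where (conj (str "WHERE "
                               (str/join " AND "
                                         (for [[col op v] where]
                                           (str (name col) " " (name op) " " v)))))
              limit (conj (str "LIMIT " limit)))))

;; The query is built with ordinary conditionals on ordinary maps.
(defn films-query [{:keys [since max-rows]}]
  (cond-> {:select [:title :year] :from :films}
    since    (assoc :where [[:year :>= since]])
    max-rows (assoc :limit max-rows)))

(format-query (films-query {}))
;; => "SELECT title, year FROM films"
(format-query (films-query {:since 1940 :max-rows 10}))
;; => "SELECT title, year FROM films WHERE year >= 1940 LIMIT 10"
```

Until `format-query` runs, the query is just a map that can be inspected, merged, or asserted on in a test.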
The paradigm's allure lies in its universality. Clojure's literals and interfaces make data manipulation idiomatic, but Dittwald probes broader applicability: Ruby's hashes could model entities without Rails-style classes, and JavaScript objects could stand in for classes too, though mutable arrays get in the way. Tradeoffs persist: plain data is more verbose than custom syntax, parsing it carries a cost, and declarative formats strain once a task needs general-purpose logic (Dittwald skewers Ansible's YAML for orchestration). Yet the wins compound. Applications emerge as data orchestrators: central schemas feed routing engines, page definitions spawn endpoints and UIs, and "super functions" deploy to Lambda or servers alike. In Q&A, F# users envy the extensibility and Tcl hackers nod at the list parallels, underscoring a timeless insight: data's humility begets power, turning code's labyrinth into navigable streams. For developers weary of type wars, Dittwald's tour of Clojure practice offers a refreshing creed: lean into data, and let behavior derive from it.
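
The multi-target idea can be sketched by deriving two things from one route table: an in-process dispatcher and a list of deployment descriptors. Everything here is hypothetical, including the route keys, the handler registry, and the descriptor shape, which does not correspond to any real AWS format.

```clojure
;; One source of truth: routes as data, handlers referenced by keyword.
(def routes
  [{:path "/films" :method :get  :handler :films/list}
   {:path "/films" :method :post :handler :films/create}])

(def handlers
  {:films/list   (fn [_req] {:status 200 :body "films"})
   :films/create (fn [_req] {:status 201 :body "created"})})

;; Target 1: dispatch requests inside a long-running server.
(defn dispatch [req]
  (if-let [route (first (filter #(and (= (:path %) (:uri req))
                                      (= (:method %) (:request-method req)))
                                routes))]
    ((handlers (:handler route)) req)
    {:status 404 :body "not found"}))

;; Target 2: one made-up deployment descriptor per route.
(defn lambda-descriptors [routes]
  (for [{:keys [path method handler]} routes]
    {:function-name (str (namespace handler) "-" (name handler))
     :trigger       {:http-method (name method) :path path}}))

(:status (dispatch {:uri "/films" :request-method :post})) ;; => 201
(map :function-name (lambda-descriptors routes))           ;; => ("films-list" "films-create")
```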