Exposing Resources in Datomic using Linked Data

Financial data feeds from various data providers tend to be closed off from most people due to high costs, licensing agreements, obscure documentation, and complicated business logic. The problem of understanding this data, and providing access to it for our application, is something that we (and many others) have had to solve over and over again.

Recently at Pellucid we were faced with three concrete problems:

1. Adding a new data set to make data visualizations with. This was a high-dimensional data set, and we were certain that the queries needed to build the charts had to be very parameterizable.

2. We were starting to come to terms with the difficulty of answering support questions about the data we use in our charts, given that we were serving up the data with a Finagle service that spoke a binary protocol over TCP. Support staff should not have to learn Datomic's highly expressive query language, Datalog, or have to set up a Scala console to look at the raw data being served up.

3. Different data sets that we use had semantically equivalent data that was being accessed in ways specific to each data set. And, as a long-term goal, we wanted to be able to query across data sets instead of doing multiple queries and joining in memory.

These are quite orthogonal goals, to be sure, but we embarked on a project which we thought might move us in all three directions simultaneously. We had already ingested the data set from its raw file format into Datomic, which we love. Goal 2 was easily addressable by conveying data over a more accessible protocol. Goal 1 meant that we'd have to expose quite a bit of Datalog's expressivity to be able to write all the queries we needed. And Goal 3 hinted at the need for some way to talk about things in different data silos using a common vocabulary: Linked Data, a W3C project, the need for which is brilliantly covered in this talk.

The RDF Datomic Mapping

What's the connection? Wait for it. Let's do a little technical deep dive for a second and consider the RDF format, the language of linked data or, as its wiki page puts it, "a standard model for data interchange". If you haven't come across it yet, RDF is basically a representation of a graph as a set of triples. In the vocabulary of RDF, a triple is a Subject, a Predicate and an Object. Conveniently, Datomic, in addition to being many other things, is also a triple store; it just calls its triples "facts" and their components Entity, Attribute and Value. There is a natural mapping between Datomic facts and RDF triples: entities, identified by unique Long values, map to unique Subject and Object URLs; attribute keywords map to Predicate URLs; and the primitive value types that Datomic supports can be represented quite easily with RDF's data types.

The other important property that falls out of this mapping is that the "where" clause of a Datalog query, itself also a set of triples, can quite naturally be represented in RDF with some encoding of variables. This brought us close to Goal 2, but not quite all the way there. Hand-waving over a lot of the details, it should be easy to intuit that if we had this mapping defined somewhere, we could quite easily represent a Datalog query, and the results of that query, in RDF. We came up with a slightly more compact representation of the query:

The above is a Turtle serialisation of the query represented as RDF data. Ignoring the representation for a second, this query is essentially pulling the "investing orientation" and the "investment style" description from a database of investment managers, or "owners", and is the equivalent of the SQL query:

SELECT o.id, is.description, o.orientation

The query is executed by POSTing it to the owner's container (/owner/). A container in the Linked Data Platform is the same as the root URL of a resource in familiar REST semantics, and conceptually represents a table in a data silo. So what we expect to get in response are owner entities and the entities they are related to. (As a side note, one of the reasons RDF is preferable to JSON as a data interchange format is that it represents a graph, a more general data structure than JSON's tree.)

The ldp:match section in the above query would get compiled down into the "where" clauses of a Datalog query for all owners that have an ID attribute with any value and a styleRef/description attribute equal to "Growth". (The parentheses around the matches indicate that this is a list of matches rather than a set: the ordering can be tweaked to put the most restrictive matches first, allowing the client to optimise the query.) Even in SQL this corresponds to the "where" clause.
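The Datomic-fact-to-RDF-triple mapping this post describes (entity ids become subject URLs, attribute keywords become predicate URLs, primitive values stay literal) can be sketched in a few lines. This is a minimal illustration under invented assumptions, not the actual implementation: the base URLs and the `:owner/orientation` attribute are made up for the example.

```python
# Illustrative sketch: the base URLs and attribute names below are
# assumptions invented for this example, not a real vocabulary.
ENTITY_BASE = "http://example.com/entity/"
ATTR_BASE = "http://example.com/attr/"

def fact_to_triple(entity_id, attribute, value):
    """Map a Datomic fact (Entity, Attribute, Value) to an RDF
    triple (Subject, Predicate, Object).

    - Entities, identified by unique Long values, become unique URLs.
    - Attribute keywords (e.g. ":owner/orientation") become predicate URLs.
    - Primitive values pass through; they would serialise as RDF
      literals with the matching datatype.
    """
    subject = f"{ENTITY_BASE}{entity_id}"
    # Turn the keyword into a URL fragment: strip the leading colon,
    # use the namespace/name split as base#name.
    predicate = ATTR_BASE + attribute.lstrip(":").replace("/", "#")
    return (subject, predicate, value)

triple = fact_to_triple(17592186045418, ":owner/orientation", "Long Only")
print(triple)
```

A value that is itself an entity id would instead map to an object URL via the same `ENTITY_BASE` scheme, which is what makes the result a graph rather than a tree.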
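Only the SELECT line of the SQL equivalent survives in the post. A runnable sketch of what the full query might look like, under an entirely assumed two-table schema (the table names, the styleRef foreign key, and the sample rows are all invented here), is:

```python
import sqlite3

# Hypothetical schema, invented to flesh out the post's SELECT fragment:
# owners carry an orientation and a styleRef pointing at a style row.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE investment_style (id INTEGER PRIMARY KEY, description TEXT);
    CREATE TABLE owner (
        id INTEGER PRIMARY KEY,
        orientation TEXT,
        styleRef INTEGER REFERENCES investment_style(id)
    );
    INSERT INTO investment_style VALUES (1, 'Growth'), (2, 'Value');
    INSERT INTO owner VALUES (10, 'Long Only', 1), (11, 'Hedge', 2);
""")

# The post aliases investment_style as "is", a reserved word in many
# SQL dialects, so the sketch uses "s" instead.
rows = conn.execute("""
    SELECT o.id, s.description, o.orientation
    FROM owner o
    JOIN investment_style s ON o.styleRef = s.id
    WHERE s.description = 'Growth'
""").fetchall()

print(rows)  # only the owner whose style description is 'Growth'
```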
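To make the match-compilation step concrete, here is a hedged toy sketch of turning an ordered list of matches into Datalog-style "where" clauses. The attribute keywords and variable-naming scheme are assumptions; in real Datalog the ref-valued styleRef/description match would expand into two clauses joined through an intermediate variable, which this version glosses over.

```python
def compile_matches(matches):
    """Compile (attribute, value) matches into Datalog-style
    :where clause strings. A value of None means "any value",
    mirroring "an ID attribute with any value" in the post.
    Equality matches are emitted first, since putting the most
    restrictive clauses first lets the engine narrow the
    candidate set early."""
    clauses = []
    anon = 0
    # Stable sort: None-valued (unrestrictive) matches go last.
    for attribute, value in sorted(matches, key=lambda m: m[1] is None):
        if value is None:
            anon += 1
            clauses.append(f"[?owner {attribute} ?_any{anon}]")
        else:
            clauses.append(f'[?owner {attribute} "{value}"]')
    return clauses

where = compile_matches([(":owner/id", None),
                         (":owner/styleRef/description", "Growth")])
print(where)
```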