Aquameta: Web development platform built in PostgreSQL (github.com/aquametalabs)
355 points by dustingetz on Oct 17, 2019 | 91 comments


I think this is an awesome idea. I really like the Smalltalk approach of not using files and instead representing the structure of a program purely in memory. I also love the idea of drawing inspiration from spreadsheets and databases instead of representing programs purely in lines of code.

I applaud this effort and can't wait to try it out!


Any long-lived executable sits in memory (notwithstanding paging, which the database server is also subject to), and such an executable is free to maintain any files it accesses in memory as well.


I think what was being referred to is the technique of defining a program, not as source code files which are then compiled/run in an interpreter, but as an in-memory VM environment (think REPLs). The memory-based environment can then be saved to a file, similar to making a core dump.

This changes the structure of programs from a file/disk-based structure to the potentially richer structure of the programming environment (e.g. graphs of objects), although it comes with disadvantages like losing the tool interoperability of files.


I have a hard time seeing the advantages of ditching files. I like this concept, and you could have it both ways (come up with a graphical notation that you can then manipulate with graph-like tools), but at the end of the day you need a source of truth, and the file metaphor (a named region of code) is hard to beat (impossible to beat?). Even databases store things as files.


File systems are just one approach to storage. Databases can use file systems, block devices, or object-based storage (such as distributed storage systems, e.g. Ceph, Amazon S3). Traditional file systems are, by themselves, inadequate for many use cases, as they don't provide things like versioning.

The strength of the file system is that much tooling already exists for it.


That sounds like doing for code what we used to do for deployments before Docker/Kubernetes. Like going back from a declarative and reproducible approach to a more procedural one. There are good reasons infrastructure as code took hold.


I like their meta, and semantics layers.

In the past I've built a scheme similar to their semantics layer, where a bunch of additional metadata is associated with schema metadata via PostgreSQL's COMMENT statement (which lets you associate a free-form text comment with all sorts of schema elements). In that scheme we had JSON COMMENTs and a set of views that ultimately generates a nice JSON representation of a database's schema, including the JSON COMMENTs in the right places, and _that_ gave us UI control that we could then use to generate Admin-on-REST UIs from. And PostgREST can be used to get an HTTP JSON API for free.
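Roughly, the shape of that scheme looks like this (a minimal sketch -- the schema, table, and JSON keys here are invented, not the real production code):

  -- Attach UI metadata as a JSON comment on a column:
  COMMENT ON COLUMN app.customer.email IS
    '{"label": "Email address", "widget": "email", "required": true}';

  -- Fold the JSON comments back into a machine-readable schema description:
  CREATE VIEW app.ui_columns AS
  SELECT c.table_schema, c.table_name, c.column_name,
         col_description(format('%I.%I', c.table_schema, c.table_name)::regclass,
                         c.ordinal_position)::jsonb AS ui  -- NULL if no comment
  FROM information_schema.columns c
  WHERE c.table_schema = 'app';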

For event pub/sub, I've written and used an alternative (all-PlPgSQL-coded) view materialization system that supports live-updating of materialized views as well as recording deltas, which can then be combined with NOTIFYs. Unlike Aquameta, though, because NOTIFY requires no authorization and its payload is free-form text, I feel queasy about sending out too much information in NOTIFYs -- instead I use them to drive queries for new deltas as recorded by the delta recorder mentioned earlier in this paragraph. The component that LISTENs for NOTIFYs then writes deltas to a file which is served with a special HTTP server that supports hanging GETs of files -- "tailfhttpd" -- and any HTTP client can then be used to tail these files.
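The delta/NOTIFY split is roughly this pattern (a sketch with invented table, channel, and trigger names, not the actual materialization code; the NOTIFY carries nothing and just tells listeners to come query for new deltas):

  CREATE TABLE app.deltas (
      id       bigserial PRIMARY KEY,
      relname  text  NOT NULL,
      op       text  NOT NULL,
      row_data jsonb
  );

  CREATE FUNCTION app.record_delta() RETURNS trigger AS $$
  BEGIN
      INSERT INTO app.deltas (relname, op, row_data)
      VALUES (TG_TABLE_NAME, TG_OP, to_jsonb(NEW));
      PERFORM pg_notify('app_deltas', '');  -- empty payload on purpose
      RETURN NEW;
  END;
  $$ LANGUAGE plpgsql;

  CREATE TRIGGER record_delta AFTER INSERT OR UPDATE ON app.orders
      FOR EACH ROW EXECUTE PROCEDURE app.record_delta();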

Anyways, the Aquameta scheme is pretty good.

EDIT: I've been tempted to write an authorization-for-NOTIFY patch to PG... I really don't like the idea that if I give someone direct access to the DB they can NOTIFY anything they like on any channel.


Dang some cool ideas in here.

Yeah trying to figure out how to annotate the schema was a big motivator for inventing the meta identifier system. Thought about using COMMENT for documentation, but once you have meta-ids, I thought it was cleaner to just put schema annotations in a different table.

Yeah I think the LISTEN/NOTIFY system in PostgreSQL is a bit primitive. Still working on that part of the project. We got NOTIFYs to propagate up through nginx over a WebSocket and get them into web-world that way, but that section of the project is still fairly immature.


Thanks. You can find some of that code here: https://github.com/twosigma/postgresql-contrib

I need to finish the open sourcing of tailfhttpd though. It's... an open-coded HTTP server, written in C, specifically tailored for tailing files over HTTP. It uses epoll and inotify, and is C10K, and blazingly fast. You have to front it with nginx or envoy to get TLS support, naturally. Because it's so simple, open-coding HTTP/1.1 seemed reasonable at the time, though nowadays writing this in Rust with some reasonable framework would be better.

The key to tailing files over HTTP is to have a server that supports:

  - weak ETags (st_dev, st_ino, generation number)
  - If-Match and If-None-Match conditional request support
  - Range requests
  - when the right end of the requested range is left unspecified, use chunked
    transfer-encoding and don't send the terminator chunk until a) EOF is reached,
    and b) the file is renamed out of the way or unlinked (which the server can
    detect using inotify or similar)

With this clients can tail a file. Heartbeats can only be supported by the application itself -- HTTP/1.1 does not have the ability to send empty non-terminating chunks (HTTP/2.0 doesn't have that problem).

If the client loses its tail, it can simply resume by using If-Match and a Range starting at the byte offset just after the last byte it received. So recovery is trivial.
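A resume request then looks something like this (host, path, and ETag format are made up for illustration):

  GET /deltas/orders.log HTTP/1.1
  Host: tailfhttpd.internal
  If-Match: W/"2049-917561-3"
  Range: bytes=183204-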

And presto, super cheap, simple, and reliable pub/sub.


What I like about using COMMENT with JSON content is that it makes it trivial to extend the schema in powerful ways, and then Aquameta's meta can be stateless views (plus INSTEAD OF triggers). That makes meta much easier to swallow.


Oh, we actually split meta up into just the views, plus an additional extension for the update triggers. Yeah, the updatable views are a giant foot gun -- essential in our case, but not generally.

https://github.com/aquametalabs/meta


Tangential, from the introduction: "Centralized systems are BORING. The early days of the web were honestly more exciting, more raw, more wild-wild-west. We need to get back to that vibe." - I adhere to that philosophy. Does anyone know of more projects in that direction?


The long-term idea with Aquameta is that users run their own local database and make connections directly to their friends/peers to basically "git pull" new content as it appears via pub/sub. Then users would do the "browsing" on localhost. I still have some work to do figuring out all that NAT piercing p2p stuff, but some combination of WebRTC and headless chrome is showing some promise.


My favorite NAT-busting mechanism is Tor hidden services. The onion network provides the basic P2P overlay as well, so it keeps life simple. STOMP over WebSockets is cool, and I believe it can be implemented without using any Chrome code these days.


Hadn't seen STOMP before, cool. I think making Tor a core part of the architecture is a little over-engineered in our case. The Internet is already a p2p network damnit! I like WebRTC because they seem to solve just the NAT-piercing problem specifically without a whole additional network layer.


I've worked with WebRTC NAT punching pretty extensively, and it is not the solution to your problem. STUN and TURN both require details like proxies with fixed IP addresses, specific DNS and SSL configuration, and so on. So for the nodes to be independent, most users will need to set those services up.

So while the added network layer of Tor (or any overlay network, really) certainly adds a level of complexity, from the application standpoint it actually simplifies matters, for the following reason: your application no longer needs to think about the topology of the underlying IP network, since that implicit detail, dragged in by using direct IP connectivity, is abstracted away by the overlay network. Instead, your application can interface with any peer via (in the case of Tor) an HTTP proxy and a set of opaque base URLs.

IOW, you're going to need one or more additional daemons to make the P2P part go, and Tor is legitimately the simplest thing available today that accomplishes this without ruling out other styles of overlay network from an architectural point of view.


Interesting.

I don't mind adding additional daemons to the system if it's essential. But, Tor also (as I understand it, given that this is a ways outside my core competency) adds a whole encryption and anonymization layer, which would be undesirable overhead in at least some scenarios, which is why it seems too heavy to be part of core. Maybe WebRTC isn't going to work reliably which would be very disappointing. I'd like to hear more about your experiences and could really use some help making a good decision here. Ping my email if you're available!


Not sure if that would work for what you have in mind, but maybe people in the same 'realm' could all be on the same zerotier network (or maybe tinc, or something similar), and you'd have a firewall only letting the pub/sub traffic through that interface. Might be good enough for prototyping, anyway...


Neat! I was wondering how webRTC fit in. Thanks for sharing, I find outside-the-box stuff like this inspiring.


There are some static generators like Hexo, Jekyll, Hakyll and others - they provide a good mix of using static files and compilation to generate indexes, tags and other features that "dynamic" systems provide. Some setups are really complex, but for personal projects, you can keep it simple.


I'm working on a distributed, decentralized web forum system.

Demo: http://rusrs.com/

It works in modern browsers with some extra JS-based niceties, but is also compatible with just about every browser I've tried, including Mosaic and Dillo.


Storing HTML in a database is wild-wild-west? I thought it was just silly, because you can't stream it from disk over a socket.


Wait until you hear about my storage array built entirely on floppy disks.


Punched cards or GTFO


Familiar with these guys?

https://indieweb.org/


I'm going to Twitch stream the Aquameta install and give a little demo, and answer questions folks might have. 5pm CDT. https://www.twitch.tv/events/j5MGrQ91TwSzRpo28wqGdw


Very cool. Does that record? Would like to watch later.


Twitch does not record broadcasts unless the streamer enables the option. VODs (videos on demand) on Twitch stay up for 16 to 60 days (depending on sub, turbo, etc). If it's a highlight video, it stays up indefinitely.


VOD is available - starts around 6min into the video.

https://www.twitch.tv/videos/495999132


I cut out the five minutes of me trying to figure out Twitch so here's a better link:

https://www.twitch.tv/videos/496396478


I guess this isn't that dissimilar from Git, except that when you snapshot your environment, you get everything (code, IDE settings, issue tracking, and persistent data) all in the same dump.

There's definitely something to this. I'm not sure that building the IDE on top of it is practical (partly because web IDEs are still pretty limited and partly because the scope is just enormous).

But if we had a way to version code that combines source, data, and issues, I'm super interested. It just needs to pipe into existing tools rather than recreating existing tools.


Call me old fashioned, but I prefer my web app spread across a dozen servers, file systems, hundreds of virtualized processes, and a few thousand folders, employing a small army of systems engineers (SREs) and developers to maintain it.


Ha :)


This was announced on the Future of Coding Slack [1] not long ago. I encourage people who are into things like structured/projectional editors, visual programming and other wild re-imaginations of programming to join.

[1]: https://futureofcoding.org/community



Unfortunately that upload is only in 240p so most of the jokes are hard to read. I have a copy of the original version which I've uploaded here: https://2by2.info/sql_on_rails_screencast2_lq.mov (VLC can play it).


I uploaded your file to Streamable: https://streamable.com/o5g3f


Experienced something like this early in my career when a co-worker discovered PL/SQL - he campaigned for putting the entire web app into an Oracle database with all the logic in stored procedures.

Experienced a slightly different take a few years later where Some Genius wrote an interpreter in stored procedures in an off-brand RDBMS. Web pages were a combination of HTML and his custom language, and these pages were stored in the database.

I have strong negative opinions about putting a web development platform entirely inside an RDBMS.


> he campaigned for putting the entire web app into an Oracle database with all the logic in stored procedures.

Oracle has already done this; it's called Application Express[0]. I used it at my last job. In practice it was pretty good for fast prototyping and iterating on database-backed apps. It took the schema as the source of truth and would add useful interface features based on it (e.g., dropdown lists derived from foreign keys, converting some check constraints to JavaScript validation code, etc).

But on the downside it was essentially untestable and version control meant dumping a massive file of autogenerated PL/SQL and checking it into git. For anything more complicated than basic CRUD and reporting you'd wind up cracking the hood to directly use PL/SQL and then skin it with APEX. The tradeoff being that you lost some of the roundtrip niceties.

I wouldn't recommend it in general, but it fit ... OK in its context. It was included in the Oracle license, the org had experience with operating Oracle and it allowed each of us in a small programming group to build little apps to solve genuine business problems that were either too small to select OSS/COTS or too specialised to find anything to select.

The best part of the job was getting rid of painful processes built around sharing Excel and Word docs. I remember an occasion when a user broke down in tears because a (to me) trivial app turned a stressful, high-pressure, error-prone, month-long process into a non-event. One of the best days of my career, and I partially owe it to a platform that I think is, in most respects, a mistake.

[0] https://apex.oracle.com/en/


This was before an official release, late 1990s. I'm sure he'd gotten his hands on either some documentation discussing the concept, or an early alpha release of Oracle's ApEx.


Might have been Oracle Web DB, it predated the APEX line but was roughly the same concept and released in the late 90's.


Oh very possibly.

Wikipedia reckons HTML DB (the original package) was released in 2004, but the APEX website clearly claims to be 19 years old, which would put it circa 2000.

But in the meantime if you're doing late-90s CGI with Perl, why not `SELECT my_amazing_function()`?


Hey, buddy, we were cutting edge with the brand-spankin-new ASP! (Although, we did have a Perl guy who insisted we'd be renaming all our .asp files to .aspl because MS had 'opened up' the scripting engine to take whatever language you'd want to write an engine for ... and a Perl engine existed at that moment.)


Not surprising. Databases have been basically awful at being approachable from the command-line. Let's say I have some HTML in some field in the db and want to make a change to it from the terminal. How do I do this?

PostgreSQL says use an UPDATE statement, but that looks like "update template set content='<html><body>......</html>' where id = x", which means you have to paste the entire content of the page into the update statement. There's no command-line field editor -- just one of many reasons why the db feels like a black box. Awful.

Aquameta has a filesystem integration layer that lets you browse the database from the command line, grep database content, edit the content of a field using your preferred text editor. http://blog.aquameta.com/intro-chpater2-filesystem/

Generally just trying to ease these pain points has been a big goal of the project... still a lot to do.


That is incorrect because PostgreSQL has XML operators.


You're not the first to have "strong negative opinions" about putting business logic in an RDBMS.

Now, that's not quite what you said! You're against "putting a web development platform entirely inside an RDBMS", and this much I agree with.

But you seem to object to putting business logic in the RDBMS as well, and I want to tell you why I don't agree with that.

The problem with not putting business logic in an RDBMS is that you risk ending up with an application which evolves towards... implementing features already provided by the RDBMS.

You also need to be very careful with direct DB access, since anything you do there will bypass business logic that lives only in the application. That also means you can't use things like PostgREST to get a REST API for free. Conversely, if you do want to use PostgREST or the like, you must move your business logic into the RDBMS.
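To make that concrete, a hedged sketch (the schema, table, and function names are invented): a rule like "quantities must be positive" can live next to the data, and PostgREST then exposes the function as POST /rpc/place_order.

  CREATE FUNCTION api.place_order(p_customer_id int, p_item_id int, p_qty int)
  RETURNS api.orders AS $$
  DECLARE
      o api.orders;
  BEGIN
      IF p_qty <= 0 THEN
          RAISE EXCEPTION 'quantity must be positive';
      END IF;
      INSERT INTO api.orders (customer_id, item_id, qty)
      VALUES (p_customer_id, p_item_id, p_qty)
      RETURNING * INTO o;
      RETURN o;
  END;
  $$ LANGUAGE plpgsql;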

Now, I understand not wanting to do this with Oracle. There are many reasons for that. And I understand not wanting anyone to write an interpreter in PL/SQL! I also understand not wanting static HTML and other such resources stored in the DB -- there's nothing wrong with using a filesystem and $FAVORITE_VCS for that, and indeed, the version control in Aquameta is something I don't quite approve of. But none of that says you shouldn't have business logic in the RDBMS.


I don't disagree particularly with any of this, and I'd say an immature part of Aquameta is where to put the business logic. I think though that Aquameta's approach is somewhat agnostic to this whole problem. As a tool right now, it's good at CRUD and a free REST API (that reflects the schema) and rapid UI development, but the best pattern for what you would typically put in an ORM isn't figured out. But, the approach really doesn't constrain adding a middleware layer if one is so inclined, or alternately putting all that logic in the database. I think there's a place for that for sure. As long as that place is somewhere in the database. :)


For a good overview of what power and speed you can have by "putting a web development platform entirely inside an RDBMS", check out this video called "Twitter-like App in 20 minutes with Oracle APEX"

https://www.youtube.com/watch?v=bosMSghczUo


I'm not sure it's "should I do this" but more "can I do this".

PL/SQL is really a powerful language and entirely capable of this. But it's not a language whose power is best utilized for this -- it's a maze to encode (much) business logic within the DB.


Probably better to use PL/V8 these days.


> I have strong negative opinions about putting a web development platform entirely inside an RDBMS.

What about your experiences led to your negative opinions?


Off the top of my head:

* version control

* environment mismatches (think managing dev, staging, prod with multiple devs)

* deployment management

* discoverability of where things live so you can maintain them, and autocomplete in IDEs

The last is the big one. If a client asks me to change the header, and me going to my IDE and quick searching "header" doesn't automatically show me all header related files/code then your platform has already lost the developer experience battle. Something like this would need tooling to match what is already possible in that regard.


While I agree with the overall negative opinion of overzealous stored procedure usage, to echo your last sentence, I've come to realize many of these problems are completely solvable with tooling. A personal anecdote, and please excuse the self-plug:

I'm the author of an infrastructure-as-code tool for MySQL/MariaDB schema management, Skeema [1]. Basically it allows you to store CREATE statements in a repo, and you can diff/push/pull between the repo and live DB environments. Originally the tool only supported tables, as my personal MySQL experience skews towards social networking companies that banned stored procs outright.

Earlier this year a company generously sponsored development of stored proc/func support in Skeema. And although I've long been a stored proc skeptic, I have to say my opinion has softened considerably after building this functionality. It essentially solves the first 3 bullets you've listed here: allowing storage of CREATE PROCEDURE statements in a git repo; ability to diff the current state of the repo against any live db environment; ability to push the current repo state to any live db environment. And the 4th bullet just depends on your IDE's support for different SQL / T-SQL / PLSQL dialects.
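For a sense of what ends up in such a repo -- a hypothetical definition file, not taken from any real setup:

  -- procs/cleanup_sessions.sql
  CREATE PROCEDURE cleanup_sessions()
  BEGIN
    DELETE FROM sessions WHERE last_seen < NOW() - INTERVAL 30 DAY;
  END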

I still have some scalability and maintainability concerns around extensive use of stored procs (especially in MySQL/MariaDB), but nonetheless found this experience to be unexpectedly eye-opening. My perspective went from "stored procs are an operational nightmare, avoid" to "huh these are actually quite useful and totally manageable given a solid development/deployment story."

[1] https://www.skeema.io


Tooling on databases has been traditionally awful, and I think that has a lot to do with folks' negative experiences with the database taking up more architectural space.

Schema migration is hard. Skeema looks cool. Aquameta's idea with schema migration is to treat the schema as data, so that each table is a row in the meta.table table, each view is a row in the meta.view table, each column... etc. So when version control checks out a new version of the project, if, say, columns were added, they're just inserted at runtime. Checkout prioritizes meta entities so they happen first and in a sensible order [1], so the migration happens before the data. This does technically work for all scenarios (I think!), but in the case of, say, a column rename, it would naively delete all the values and then have to re-set them per row, which could be a big inefficiency. It would still work just fine for smaller tables though. Alternately, some kind of optional migration script per commit was another idea.


Intriguing product - and amazing work! How do you handle column renames and more complicated data migrations (e.g. split "full_name" on whitespace)? Django's migration framework handles these by having you check into your VCS each "diff" in the migration DAG, and providing enough metadata to know how to go forward and back, but this also means that to change VCS branches you need to roll back migrations in a highly counterintuitive way. If there was a way to have a checked-in declarative schema, with metadata to indicate how data migrations should work so you can just `git checkout; skeema push`, it would be a gamechanger.


Thanks! Great questions -- unfortunately the short answer to both currently is "not supported yet" :) But there's reasoning behind it being on the back-burner in both cases; in some older threads I've discussed the lack of renames [1] and data migrations [2][3].

[1] https://news.ycombinator.com/item?id=19882611

[2] https://news.ycombinator.com/item?id=20873608

[3] https://news.ycombinator.com/item?id=19882862


We’ve built something very similar for PL/pgSQL. It integrates with Gradle so we even have transitive dependencies on different schemas. It has completely changed our approach to software development - development is faster and the application is literally an order of magnitude faster.

The key was to build the tool up front. Without the tool we couldn’t have done it.


Indeed these were the major downsides.

Further, in my Oracle instance, relying on the late-1990s Oracle database server to also run our business logic came with performance and scalability concerns, especially around licensing.

In my other case, the custom language was terrible, but that's not a complaint about the use of the db engine. The interpreter was implemented in stored procedures, the pages were stored in text fields ... every bit of the stack depended entirely on this off-brand database that had its own issues (like corrupt indexes that needed repairing about once per month.)

I think ultimately, in both cases it just felt so much like putting all your eggs in one basket and then hoping you never needed to augment the basket with additional baskets, nor replace the basket with one of a different shape or made of other materials.


Localization alone can make a schema complex as well. One of the projects I'm on took on a lot of configurations managed in the database, and nearly all the backend logic in stored procedures. Discoverability is a huge issue, and nobody really understands the schema as a whole. The project itself is only just over a year old with a team of 6-8 devs. In the end, some like it a lot, others not so much, and it's all friction to development.


Well, just glancing at that source code: I do not know Postgres like I thought.


Well, to be fair, you can do all sorts of things with Postgres if you build custom extensions for it


Aquameta has been the life project of Eric Hanson for close to 20 years, off and on. Functional prototypes have been developed in XML, RDF and MySQL, but PostgreSQL is the first database he's found with the functionality necessary to get something close to practical, and huge advances in web technology like WebRTC, ES6 modules, and more have shown some light at the end of the tunnel.

Aquameta is an experimental project, still in early stages of development. It is not suitable for production development and should not be used in an untrusted or mission-critical environment.

Not really a basis for a 'reference success stories' approach to evaluation. Better to check out the repo, hack at Postgres, and see where the hypothesis can be made pragmatic.



Their argument from the introduction -- that software can slow down a business by making changing/evolving the systems difficult and requiring programmers at every step -- is an interesting one:

http://blog.aquameta.com/introducing-aquameta/

Obviously this has been the panacea goal in business software forever, and there's a long trail of failed companies or projects that tried to do this.

IMO it's always going to require specialized knowledge of a certain level of abstraction above the machine, it will never be as simple as pushing buttons in a GUI. The question is making the languages/frameworks simpler and lowering the bar.

In practice these sorts of things work great for simple scenarios but get very brittle once you push beyond them. Just like Excel spreadsheets used as databases, it will quickly turn into a hacky maze of things forced into places where they shouldn't be.

I'd also never think 'PLpgSQL' when it comes to simplifying things. I'm genuinely curious to see if they can pull it off... eliminating the file system part is also an interesting idea.


It's not that using PlPgSQL simplifies things, but that putting business logic in the RDBMS does simplify things. It does so by letting you use an ecosystem of tools you otherwise could not use. There are several tools that let you serve a DB via any number of protocols, including RESTful APIs, for example. You could even give out direct SQL access to power users and not have to teach them how to maintain referential integrity.

PlPgSQL is not a great language, and that's obviously not a reason to use it. But it's not obviously a reason not to use it, and more than that, it's no reason not to put business logic in the RDBMS. Once you decide to put business logic in the RDBMS, PlPgSQL is just one of several languages you could use, and actually the most accessible one.


The MO of SV startups seems to be to slap together something that has product market fit while ignoring performance and scaling concerns, at first. Once you have enough customers and market, you then raise funding and use that to hire more experienced system folks who can make your system performant, efficient and scalable.


I wasn’t talking about scaling performance wise. I mean scaling the problem set beyond the initial easy ones.


Is this comparable to Oracle Apex, which is a web development tool written entirely in oracle PL/SQL?


Used Apex when it first came out (as HTMLDB). I’ve never found anything that compared to its speed in rolling out web based forms for internal use. At the time, it was the best BI tool for ad hoc queries around. Nowadays I use metabase, but I’ve often missed Apex speed and wished PG had it.


I had to think about Oracle Apex as well. Back when I started developing software, it blew my mind that I didn't have any code files I could see anywhere. I still don't know why anybody would use Oracle Apex for slightly bigger projects. It just seems so abstract.


Cool concept!

Has anyone built anything in Aquameta? Would like to see some success stories.


I worked on Aquameta for a few years and I'll say that having a build running locally is a joy I've never encountered elsewhere in my career.

Every time I need a little app or something, instead of reaching for some SAAS product, I would just build a quick prototype for myself. I could make a prototype in a couple hours and over the course of a week or two I would polish it up when I had a few extra minutes. Rather than using something built for the masses, I could tweak the interface to my own taste and I owned all the data.


No. :) It is still in early stages. I use it for my projects but I end up dropping to the PostgreSQL prompt occasionally and doing things nobody but me would know how to do. Every time I do though I try to fix the source of the problem. Really need to start putting up some example projects though. I'd say it's ready for very early adopters.


Greenspun's tenth rule may need to be updated for IDEs:

"Any sufficiently complicated C or Fortran program contains an ad-hoc, informally-specified, bug-ridden, slow implementation of half of a JavaScript IDE." (previously Common Lisp)


> Technical goals of the project include:

> Allow complete management of the database using only INSERT, UPDATE and DELETE commands (expose the DDL as DML)

There might be other reasons -- but note that Postgres actually supports transactions and rollback of DDL. Oracle with a full enterprise license has something similar (there's an undo/"trashcan"-style cache for schema modifications).

But in general (as far as I understand; I have not used this in anger), PG lets you simply BEGIN; DROP TABLE (...); ALTER TABLE (...); (whoops, something's wrong) ROLLBACK;
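Something like this, say (table and index names invented for illustration):

  BEGIN;
  ALTER TABLE accounts ADD COLUMN archived boolean NOT NULL DEFAULT false;
  DROP INDEX accounts_email_old_idx;
  -- whoops, something's wrong:
  ROLLBACK;  -- both the ALTER and the DROP are undone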

It's very neat, and not generally supported.


Yes, yes, but, it's nice to be able to treat schema as relational data itself. Indeed, the information_schema and pg_catalog already let you do that, but the information_schema is incomplete, and the pg_catalog is hard to use and unstable (that is, each release can change the pg_catalog backwards-incompatibly).

Besides letting you read schema via normal SELECTs on a well-designed meta-schema, something like Aquameta lets you run DDL as DML too. That means you can generate and apply schema changes dynamically without needing EXECUTE -- you can just have a bunch of normal INSERT/UPDATE/DELETE statements (including via CTEs) and generate schema from other data the same way you'd generate data from data. I think that's a big plus. But then I've worked with a database before that did this schema-as-data-in-a-meta-schema thing, and I found it very comfortable.

For example, without a metaschema you have to use CREATE THING IF NOT EXISTS then ALTER THING ... in order to apply schema changes. Whereas with something like Aquameta you just INSERT INTO ... WHERE NOT EXISTS ... (or with ON CONFLICT DO ...) and that's that. You can have one set of DDLs that create schema, and the same set of DDLs also updates schema. That's wonderful.
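For instance, a hedged sketch of the contrast (the meta view and its column names here are illustrative, not necessarily Aquameta's actual ones):

  -- Imperative, create-if-missing style:
  CREATE TABLE IF NOT EXISTS app.tag (name text PRIMARY KEY);
  ALTER TABLE app.tag ADD COLUMN IF NOT EXISTS color text;

  -- Schema-as-data style: the same DML both creates and converges the schema.
  INSERT INTO meta.column (schema_name, relation_name, name, type_name)
  SELECT 'app', 'tag', 'color', 'text'
  WHERE NOT EXISTS (
      SELECT 1 FROM meta.column
      WHERE schema_name = 'app' AND relation_name = 'tag' AND name = 'color'
  );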


I'm still not entirely clear on why this is a good thing. Is it a preference -- sort of like hygienic vs. unhygienic macros -- for data structures?

Do you end up with old data in the old schema, or is there some implicit conversion? Is an UPDATE that drops a column equivalent to dropping a column, and can you roll it back normally?


It's like functions that return functions: SQL that generates SQL. Normally, in PG land, one generates code using SQL, using the format() function, and then one EXECUTEs it. That would still be the case for generated FUNCTION bodies, naturally, but for everything else you could just run DML against a small and well-tested metaschema that does this work for you.
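The format()-and-EXECUTE route being contrasted looks roughly like this (schema and table names invented):

  DO $$
  BEGIN
      EXECUTE format(
          'CREATE TABLE IF NOT EXISTS %I.%I (id bigserial PRIMARY KEY)',
          'app', 'audit_log');
  END $$;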

Just as in Lisp you might rather not use eval but instead use macros, here you'd prefer DML against a metaschema over generating and EXECUTE-ing DDL. Hygiene is one of the motivators.

And yes, you could DELETE from a view that represents columns instead of using ALTER TABLE ... DROP COLUMN ... All the usual transactional semantics apply. You can BEGIN, delete that row, and rollback, leaving the DB unchanged.


Yes almost all DDL is transactional in PG, and it works as you say. You can create tables, drop indexes, replace functions, update views - and then roll it all back.

There are a very small number of DDL statements that you can’t do in a transaction, but the devs are fixing them (adding a value to an enum is the only one that comes to mind)

We use it to great effect when upgrading schemas.


> adding a value to an enum is the only one that comes to mind

Adding enum values in a transaction is now supported in PG 12 as long as you don't try to use the new value in the same transaction.
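That is, something like this now works (the type and value names are just examples):

  BEGIN;
  ALTER TYPE order_status ADD VALUE 'refunded';
  -- Referencing 'refunded' before COMMIT would still error; commit first, then use it.
  COMMIT;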


"Under the hood, Aquameta is a "datafied" web stack, built entirely in PostgreSQL. The structure of a typical web framework is represented in Aquameta as big database schema with 6 postgreSQL schemas containing ~60 tables, ~50 views and ~90 stored procedures. Apps developed in Aquameta are represented entirely as relational data, and all development, at an atomic level, is some form of data manipulation."

This is triggering painful flashbacks to SharePoint development.


What were the painful parts of SharePoint development?


What were not the painful points of SharePoint development?


I don't know. That's why I'm asking.


I am impressed they've implemented so much. My main point of critique is that your app is not just relational data -- it's relational data plus the ~1.5M lines of traditional code that make up PostgreSQL itself :)


Reminds me of CouchDB apps, where you'd use document attachments to store HTML/public files and serve them directly from Couch.

https://docs.couchdb.org/en/2.0.0/couchapp/


This is actually really similar to on-premise SharePoint development; list data, content types, workflow definitions, front-end HTML, CSS, JS code are just stored in the back end database.


Data-driven applications are cool. I think I'd use Datomic over Postgres, but it didn't exist 20 years ago.


Make sure you back up that database...


Lol, that's some great advice right there.


This greatly reminds me of my experience with AEM (Adobe Experience Manager). Web-based IDE, version control via OSGI bundles, content repository via Jackrabbit.

I did not enjoy my time working with it but it had some neat ideas and did do some things well.



