The next generation relational database.

What is EdgeDB?

EdgeDB is an open-source object-relational database built on top of PostgreSQL. The goal of EdgeDB is to empower its users to build safe and efficient software with less effort.

EdgeDB features:

  • strict, strongly typed schema;
  • powerful and expressive query language;
  • rich standard library;
  • built-in support for schema migrations;
  • native GraphQL support.

Check out the blog posts for more examples and the philosophy behind EdgeDB.

Modern Type-safe Schema

The data schema in EdgeDB is a clean high-level representation of a conceptual data model:

type User {
    required property name -> str;
}

type Person {
    required property first_name -> str;
    required property last_name -> str;
}

type Review {
    required property body -> str;
    required property rating -> int64 {
        constraint min_value(0);
        constraint max_value(5);
    }

    required link author -> User;
    required link movie -> Movie;

    required property creation_time -> local_datetime;
}

type Movie {
    required property title -> str;
    required property year -> int64;
    required property description -> str;

    multi link directors -> Person;
    multi link cast -> Person;

    property avg_rating := math::mean(.<movie[IS Review].rating);
}

EdgeDB has a rich library of datatypes and functions.
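
For instance, a few standard library calls (a minimal sketch; availability of specific functions may vary across versions):

SELECT str_upper('hello');      # 'HELLO'
SELECT len('EdgeDB');           # 6
SELECT math::mean({1, 3, 5});   # 3
SELECT datetime_current();      # the current timezone-aware timestamp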

EdgeQL

EdgeQL is the query language of EdgeDB. It is efficient, intuitive, and easy to learn.

EdgeQL supports fetching object hierarchies with arbitrary levels of nesting, filtering, sorting, and aggregation:

SELECT User {
    id,
    name,
    image,
    latest_reviews := (
        WITH UserReviews := User.<author[IS Review]
        SELECT UserReviews {
            id,
            body,
            rating,
            movie: {
                id,
                title,
                avg_rating,
            }
        }
        ORDER BY .creation_time DESC
        LIMIT 10
    )
}
FILTER .id = <uuid>$id
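
Aggregation over links is just as direct. For example, a minimal sketch (using the schema above) that counts reviews per movie:

SELECT Movie {
    title,
    num_reviews := count(.<movie[IS Review])
}
ORDER BY .title;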

Status

EdgeDB is currently in alpha. See our Issues for a list of features planned or in development.

Getting Started

Please refer to the Tutorial section of the documentation on how to install and run EdgeDB.

Documentation

The EdgeDB documentation can be found at edgedb.com/docs.

Building From Source

Please follow the instructions outlined in the documentation.

License

The code in this repository is developed and distributed under the Apache 2.0 license. See LICENSE for details.

Comments
  • restore is not the inverse of dump

    • EdgeDB Version: Docker image: edgedb/edgedb:1-beta1
    • OS Version:

    Steps to Reproduce:

    1. edgedb dump backup --all --format=dir
    2. edgedb restore -v --allow-non-empty --all backup

    Result:

    edgedb error: error applying init file "backup/init.edgeql": failed statement "# DESCRIBE ROLES\nCREATE SUPERUSER ROLE `myuser` { SET password_hash := \'SCRAM-SHA-256$4096:xxxxxxxxxxx=\'};": ERROR: SchemaError: role 'myuser' is already present in the schema
    

    I would expect that passing --allow-non-empty, especially together with --all, would allow this. Is this intended?

    I feel like there should be a way to cleanly "start fresh from a full backup".

  • Consider renaming local_datetime back to naive_datetime

    A few questions regarding our handling of date/time came up in our internal Slack recently:

    • I would like a UTC timestamp without any timezone; should I use local_datetime?
    • Or should I use datetime and expect nobody messes up the timezone?
    • If I need a timestamp, like the expiration time of something, should this be a timezone-aware datetime?
    • Should I assume it's UTC when returned from the database, or should I enforce (convert) the timezone when formatting?
    • I would expect local_datetime to be timezone-aware.

    Since this is quite a complex topic I'd like to start by explaining how EdgeDB works with date/time compared to Postgres.

    EdgeDB

    In EdgeDB there is a timezone-aware type std::datetime and a naive date/time type std::local_datetime.

    std::datetime is stored internally in UTC and binary-wise it's identical to std::local_datetime. The fact that std::datetime values are time zone aware is encoded into their type. E.g. when a user receives a std::datetime value they:

    • Know that they are dealing with a time zone aware date/time.

    • If they want to render this date/time to a string in their front-end code, they have to specify the target time zone. E.g. to render a std::datetime value to a string via EdgeQL, the user would use the std::to_str function, which requires passing the timezone the time should be rendered in. Usually that would be the timezone of the current user interacting with the date.

    • EdgeQL is strict w.r.t. separating std::datetime from std::local_datetime. For instance it's not possible to subtract std::datetime from std::local_datetime — the operation is undefined. Subtracting std::datetime from std::datetime or std::local_datetime from std::local_datetime is allowed and produces a std::duration (see the sketch after this list).

    • When a user submits a std::datetime value to the database it's their responsibility to let the DB know how to convert it to UTC. There's no magical config setting (unlike in Postgres) that would automatically let naive date/time be converted to time zone aware date/time.
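
    A minimal sketch of this strictness in EdgeQL (standard string casts; exact error wording may differ):

    SELECT <datetime>'2019-11-25T22:10:42+02:00';       # OK: offset provided
    SELECT <datetime>'2019-11-25T22:10:42';             # error: no timezone info
    SELECT <local_datetime>'2019-11-25T22:10:42';       # OK: a naive value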

    Here are a couple of use cases that explain the naming motivation:

    • datetime is a globally recognizable point in time. Examples: a space shuttle launch time; file creation time; when a log or audit record was created; the precise moment a financial transaction was committed.

    • local_datetime is something that makes sense regardless of time zone. Examples: milk expiry date; alarm clock wake-up time; arrival time on air/train tickets, which is always in the local time of the destination.

    💡 Maybe we should add this ^^ to the docs.

    PostgreSQL

    Quoting Postgres documentation:

    PostgreSQL endeavors to be compatible with the SQL standard definitions for typical usage. However, the SQL standard has an odd mix of date and time types and capabilities.

    There are two types in Postgres that are interesting for us in the context of this discussion: timestamp and timestamptz (an alias for SQL's timestamp with time zone):

    • timestamp is used to store naive date/time, e.g. when time zones aren't applicable or not known.

    • timestamptz is used to store time zone aware date/time. It is stored on the server in UTC. As in EdgeDB, the two are stored identically on disk, meaning that the difference between them is purely their type id (OID).

    So far so good: Postgres date/time handling looks just like EdgeDB's. The key difference is that in Postgres there's a server config setting TimeZone that is used to automatically convert time zone aware and naive date/time to each other. Examples:

    • if a user submits a text date/time without a timezone to a timestamptz cast, Postgres will use the config setting and convert it silently. When the user queries the submitted value back, it might differ from what was submitted because of that transformation, leaving the user confused.

    • if a user subtracts a timestamp value from a timestamptz value, Postgres will use the config setting to silently convert the timestamp to timestamptz. Again, this can lead to confusion, as the result can look totally different from what a naive user assumed.

    EdgeDB does not have automatic conversion. To submit a std::datetime value a time zone must be specified along with it. Implicit mixing of time zone aware and naive date/time values is prohibited.

    Q&A

    • I would like a UTC timestamp without any timezone; should I use local_datetime?

    You should use std::datetime. Just tell EdgeDB that it's already in UTC.
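
    For example (a sketch), an explicit Z suffix or zero offset marks the value as UTC:

    SELECT <datetime>'2021-05-01T12:00:00Z';            # interpreted as UTC
    SELECT <datetime>'2021-05-01T12:00:00+00:00';       # equivalent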

    • Or should I use datetime and expect nobody messes up the timezone?

    You should indeed use std::datetime. Make sure that your frontend code renders your date/time values either in the local user's time zone or in UTC. The format is up to you and your UI requirements; EdgeDB does not care.

    • If I need a timestamp, like the expiration time of something, should this be a timezone-aware datetime?

    Depends. If it's the expiration time of a carton of milk, I'd say you should use std::local_datetime. If it's the expiration time of the evaluation period of your SaaS offering, it should be std::datetime -- that way the user will know the precise time no matter where they currently are, in London or in New York.

    • Should I assume it's UTC when returned from the database, or should I enforce (convert) the timezone when formatting?

    For std::datetime — yes!

    For std::local_datetime — it can be whatever; the database does not know. Only the developer knows and can treat it accordingly.

    The rule of thumb here is that if the date/time value is meant to be rendered using different time zones, it should always be std::datetime.

    • I would expect local_datetime to be timezone-aware.

    Yes, I can see where the confusion is coming from. See the 💡 paragraph above for our motivation for naming it that way.


    We might want to consider renaming local_datetime back to naive_datetime. At the very least our documentation must be improved.

  • Add `module` block to SDL

    The current version of SDL and migrations does not allow expressing the migration of interdependent modules, because a single SDL blob is only supposed to reference one module. The solution is to force SDL to express the entire user schema in the context of a migration. This means that a single SDL block must denote unambiguously which modules the various entities belong to. There are two mechanisms proposed for that:

    1. Fully-qualified names for all entities (except ones belonging to std, since that's always built-in in all contexts).
    2. A special module <name> { ... } block. All inner SDL inside this block is understood to have the default module <name>. It is possible to have multiple module blocks with the same name in a single SDL document.

    The total list of modules in the schema is inferred from module block names and from fully-qualified names.

    Example of SDL using module blocks:

    module default {
       # the default module here is "default"
       type Foo {
          # the module "std" is also part of the default name 
          # resolution as usual
          property name -> str;
          link bar -> other_mod::Bar;
       }
    }
    
    module other_mod {
       # the default module here is "other_mod"
       type Bar {
          link bar -> Bar;
          link foo -> default::Foo;
       }
    }
    

    And the same schema expressed using only fully-qualified names in SDL:

    # only the "std" module can be omitted here, all other 
    # names have to be fully-qualified, even "default"
    type default::Foo {
       property name -> str;
       link bar -> other_mod::Bar;
    }
    
    type other_mod::Bar {
       link bar -> other_mod::Bar;
       link foo -> default::Foo;
    }
    

    The following schemas are also equivalent:

    # Multiple module blocks with the same name
    module default {
       type Foo;
    }
    
    module other_mod {
       type Bar;
    }
    
    module default {
       type Foo2;
    }
    
    # Each module block has a unique name
    module default {
       type Foo;
       type Foo2;
    }
    
    module other_mod {
       type Bar;
    }
    
    # Mix of fully-qualified names and module blocks
    type default::Foo;
    
    module other_mod {
       type Bar;
    }
    
    module default {
       type Foo2;
    }
    
  • Edgedb schema import/export

    Is it possible to load (import migrations from a file) or to save (export) migrations to an .esdl file? As far as I've seen, not yet. I want to implement this feature, but I need to discuss how it would work first. My idea is to integrate it into the CLI like so:

    • edgedb schema export file_name.esdl to export the current database schema to a file;
    • edgedb schema import file_name.esdl to import it into the database.

    Importing would just run CREATE MIGRATION {file_name} TO {file_contents}, and exporting would save the whole schema. From what I can see, there is only CREATE MIGRATION and GET MIGRATION; is it possible to get the current schema or the list of migrations via EdgeQL?

  • Disallow DELETE and UPDATE in many cases without FILTER

    This patch errs on the side of disallowing, since it is pretty easy to work around.

    We allow DML when the source is filtered, when it comes from a query that is not a set reference, when it is a FOR iterator, and when it is the set being inserted in the ELSE clause of INSERT ... ELSE. (FOR is allowed because the meaning seems to clearly imply applying to everything... and because it made the implementation easy.)
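
    A rough sketch of the resulting behavior (hypothetical examples, not the actual test cases):

    DELETE User;                            # rejected: unfiltered set reference
    DELETE User FILTER .name = 'Alice';     # allowed: the source is filtered
    DELETE User FILTER true;                # allowed: explicit opt-in to everything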

    Most of the test cases for things we allow are existing ones.

    I can follow up with a documentation update if there is agreement on this one.

    Closes #1742.

  • Naming conventions for scalars and object types.

    Currently there's a convention in EdgeDB to name scalar types using lower_case with an added _t at the end. Object types are named using CamelCase.

    The _t suffix is meant to easily distinguish scalar type names from other lower-case names like functions and links. Abstract links are especially important here, since they exist in the same namespace as the scalar types and can't use the same name. Although the problem of clashing names is a real one and needs to be solved, adding _t is an awkward solution. In the general case of all scalar types, "give better names" may be the only generic reasonable advice.

    However, there's a subset of scalar types that may have a different solution: enums. We could have a convention that enum type names should be ALLCAPS as a reminder of what these types are (see the sketch below). Using ALLCAPS for special constants is common practice in some programming languages, and enums are conceptually similar to specialized constants.
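
    A sketch of what the ALLCAPS enum convention could look like (hypothetical type and property names):

    scalar type STATUS extending enum<'Open', 'Closed'>;

    type Ticket {
        property status -> STATUS;
    }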

    Object type names should probably keep using CamelCase.

    Whatever convention we agree on should be reflected in our own built-in libraries. Incidentally, we still have some remnants of "old-style enums" that emulated enum behavior by using a one_of constraint. The good news is that only things that are functionally enums follow the _t scalar naming; other built-in scalars just have plain readable names.

  • SELECT Object IF EXISTS Object ELSE INSERT

    Hi, I tried to use the IF...ELSE and EXISTS clauses to construct a query that either selects an item if it exists or else inserts it. Since I can do the following:

    SELECT 1 IF EXISTS (SELECT Person FILTER .name = 'Ryan Gosling') 
    ....... ELSE (SELECT 0);
    

    ...I tried to do

    SELECT Person FILTER .name = 'Ryan Gosling' 
    ....... IF EXISTS (SELECT Person FILTER .name = 'Ryan Gosling') 
    ....... ELSE (SELECT <int64>{});
    

    However I get the error: QueryError: operator 'std::IF' cannot be applied to operands of type 'std::bool', 'std::bool', 'std::int64'

    Am I making some syntax error, or is this kind of selection not allowed? I assumed it would work because the documentation on IF lists anytype as the left operand, but I have not been able to figure out how to make it work with my schema types.

    Thank you very much for your help.

  • Inconsistency with volatile calls

    1. FOR x IN {1, 2, 3} UNION (x, random()); returns identical values
    2. FOR x IN {1, 2, 3} UNION (SELECT (x, random())); returns different values
    3. SELECT (FOR x IN {1, 2, 3} UNION (SELECT (x, random()))); returns identical values (!)

    I /think/ that all 3 ought to return different values but 2 and 3 definitely should do the same thing.

  • Require a FILTER on DELETE and UPDATE?

    There was some discussion earlier about requiring FILTER on DELETE and UPDATE statements, to help prevent mishaps. FILTER true could be used when operating on everything is actually desired.

    (Relevant: https://www.youtube.com/watch?v=i_cVJgIz_Cs)

  • Consider removing `local_datetime`

    Idea

    Make a single datatype for a timestamp, which is basically a UTC datetime.

    Motivation

    1. There are very few use cases (if any) for the type
    2. It's a source of lots of errors
    3. If you really want it, think twice and use two fields for date and time (e.g. the date an alarm first started and the time it should ring for a recurring alarm, instead of combining the two; see the sketch below)
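
    A sketch of the two-field approach from item 3 (hypothetical schema; assumes the standard local_date and local_time scalars):

    type Alarm {
        property start_date -> local_date;   # the date the alarm first fires
        property ring_time -> local_time;    # wall-clock time it rings each day
    }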

    What other databases are doing?

    Terms:

    • naive -- a timestamp without a timezone specified (for example 2019-11-25 22:10:42 if written as a string)
    • unix timestamp -- a specific representation of a datetime in seconds since the unix epoch (the epoch is in UTC, so the value doesn't depend on the time zone)
    • UTC-normalized timestamp -- a datetime stored as UTC internally, be it a unix timestamp or a timestamp at UTC

    Databases:

    • PostgreSQL has TIMESTAMP, which is documented to be UTC but accepts a naive timestamp as input, so it's easy to misuse; and TIMESTAMPTZ, which accepts any timezone offset as input, converts to a UTC timestamp internally, and converts to the client session time zone on output.
    • MySQL has TIMESTAMP, which accepts a naive timestamp in local time and auto-converts to UTC for storage. DATETIME is just a naive timestamp without conversion.
    • VoltDB uses a UTC-normalized TIMESTAMP. As far as I understand, it has a Java API and it's hard to mess up timezones (because values are converted from java.util.Date, which is basically a unix timestamp).
    • MongoDB has a UTC-normalized timestamp named Date (as far as I understand, when converting to/from a string they only accept the Z-suffixed format 2019-11-25T22:10:42Z). The name probably comes from the JavaScript type name.
    • ClickHouse uses a naive format that is converted to/from local time, named DateTime (basically like TIMESTAMP in MySQL).
    • FoundationDB (in the document layer) basically supports MongoDB semantics.
    • RethinkDB actually stores a timestamp + timezone, in a type named time.
    • Cassandra uses a UTC-normalized timestamp and auto-converts to a timezone in some clients.
    • Redis has no datatype for storing datetimes but uses unix timestamps (in seconds) in commands like EXPIREAT.

    Conclusions:

    1. Except for PostgreSQL and MySQL, every database has a single datetime type (most of them don't have separate date or time types, though)
    2. PostgreSQL and MySQL look like they were trying to fix a mistake; that's why they added a second type, not because two types were needed
    3. Timestamp means different things in different databases, so it's okay to give it our own meaning (i.e. rename datetime -> timestamp)

    Update: added Cassandra.

  • Trying to update listen_addresses triggers ISE: address already in use

    • EdgeDB Version: 1.0-beta.2+ga7130d5c7.cv202104290000
    • OS Version: Ubuntu 20.04

    Steps to Reproduce:

    1. create server instance
    2. edgedb -I SERVER_NAME configure set listen_addresses 127.0.0.1 ::1 0.0.0.0
    
    edgedb error: ERROR: InternalServerError: [Errno 98] error while attempting to bind on address ('0.0.0.0', 10700): address already in use
      Hint: This is most likely a bug in EdgeDB. Please consider opening an issue ticket at https://github.com/edgedb/edgedb/issues/new?template=bug_report.md
      Server traceback:
          Traceback (most recent call last):
            File "edb/server/protocol/binary.pyx", line 1694, in edb.server.protocol.binary.EdgeConnection.main
            File "edb/server/protocol/binary.pyx", line 950, in simple_query
            File "edb/server/protocol/binary.pyx", line 1045, in _simple_query
            File "edb/server/protocol/binary.pyx", line 1002, in edb.server.protocol.binary.EdgeConnection._simple_query
            File "edb/server/protocol/binary.pyx", line 1405, in _execute_system_config
            File "edb/server/dbview/dbview.pyx", line 515, in apply_config_ops
            File "edb/server/dbview/dbview.pyx", line 649, in apply_system_config_op
            File "/usr/lib/x86_64-linux-gnu/edgedb-server-1-beta2/lib/python3.9/site-packages/edb/server/server.py", line 629, in _on_system_config_set
              await self._restart_servers_new_addr(value, self._listen_port)
            File "/usr/lib/x86_64-linux-gnu/edgedb-server-1-beta2/lib/python3.9/site-packages/edb/server/server.py", line 573, in _restart_servers_new_addr
              self._servers, _ = await self._start_servers(nethost, netport)
            File "/usr/lib/x86_64-linux-gnu/edgedb-server-1-beta2/lib/python3.9/site-packages/edb/server/server.py", line 786, in _start_servers
              tcp_srv = await self._loop.create_server(
            File "uvloop/loop.pyx", line 1768, in create_server
          OSError: [Errno 98] error while attempting to bind on address ('0.0.0.0', 10700): address already in use
    

    Despite the error, the value is changed in the config:

    edgedb> SELECT cfg::DatabaseConfig.listen_addresses;
    {'0.0.0.0', '127.0.0.1', '::1'}
    
  • __subject__ does not work inside access policies

    • EdgeDB Version: 2.7+5b16920
    • EdgeDB CLI Version: 2.2.6+af56767
    • OS Version: Ubuntu 22.04

    Steps to Reproduce:

    1. Declare a User object type, with no properties or links.
    2. Declare a computed global current_user that resolves to a User object instance.
    3. On the User type, create an allow all access policy such that __subject__ ?= global current_user.

    This will fail with there is no range var for [email protected]@(default::Redacted) value in <pg.SelectStmt at 0xDEADBEEF>.

    Schema:

    using future nonrecursive_access_policies;
    
    module default {
        global viewer_id -> uuid;
        global viewer := (select User filter .id = global viewer_id);
        
        type User {
            required property name -> str;
    
            access policy allow_self
                allow all
                using (__subject__ ?= global viewer);
        }
    }
    
  • Make the IR pickleable

    This is in furtherance of EXPLAIN, where we want to be able to hang onto the IR to do some analysis of it.

    This means making ScopeTreeNode pickleable by omitting and then recomputing the _parent weakref, and dropping the unused source_map field, which contained some dodgy stuff.

  • Track the origin of schemas stored in an expression

    Track what object/field an expression is stored in and propagate this information into context fields. It should then be possible to use this in error reporting and EXPLAIN.

    We store the id and the field name; we store id instead of name because once these values are reported to a client, id is more useful: the client can't query based on the internal name.

  • Add an expand_inhviews debug flag that forces views to get expanded

    This helps lay the groundwork for EXPLAIN: when referring to views (or non-materialized CTEs), Postgres's EXPLAIN output does not use the alias names from our compiled query. We need it to use those alias names, though, in order to match query plan nodes with parts of our query.
