Language Design: Use [ ] instead of < > for Generics

hn_throwaway_99 · on April 5, 2020

Yes, let's move away from the standard syntax used to specify generics, which is used in virtually every widely-used programming language of the past 15-20 years I can think of, and instead redefine the most common syntax used for array indexing (oh, and make array indexing look like a function call).

This fixation on relatively inconsequential issues, while totally missing the forest for the trees, is a dangerous attribute I've found in a small subset of programmers.

AnimalMuppet · on April 5, 2020

I would say it's in a small subset of people who think like computer scientists rather than like software engineers. Computer scientists think about better syntax for computer languages. (I don't think that this is actually a better syntax, especially given the already-established conventions. But thinking about array indexing as a function call is, in my view, an interesting take.)

Software engineers, on the other hand, worry about how to efficiently write non-trivial programs. For that problem, this change doesn't move the needle whatsoever.

Most non-junior programmers think more like software engineers than like computer scientists. And even many who think like computer scientists can see the forest, not just the trees.

jsjolen · on April 5, 2020

I've never seen a computer scientist put a lot of effort into syntax. The POPL people are more interested in type systems and semantics than they are in syntax.

ledauphin · on April 5, 2020

This article completely loses me at the end where it suggests that using square brackets for assignment and collection access is an unnecessary/dangerous convenience.

It might not be the author's preference, but the argument made about it getting abandoned as the language ages anyway is demonstrably false with the most popular languages in the world (Python, JavaScript, Java, C, C++), and in languages with operator overloading it could never be true no matter how far the language evolved.

aloknnikhil · on April 5, 2020

Exactly. Array indexing with [] feels ok to me.

> someList(0) /* instead of */ someList[0]

This is worse. How do you disambiguate that from a function call? It's now just abusing ()

beeforpork · on April 5, 2020

> How do you disambiguate that from a function call? It's now just abusing ()

You don't. If you apply an integer to an array, it will return an entry. To me, it makes perfect sense to use a(i) instead of a[i]. In fact, in case you need some more complicated data structure to organise your data one day, you could switch to a function call for lookup without rewriting the code that uses a(i).

The question is not 'how do you disambiguate?', but 'why do you want to distinguish?' Because an array is just a special case of a function that maps an integer to something else. It is logically not different from a function with a large contiguous switch statement.
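A minimal Python sketch of that substitution argument. The names `CallableList`, `squares` and `sum_first_three` are made up for illustration; the point is only that code written against a(i) keeps working when a stored table is swapped for a computed function.

```python
from typing import Callable, Sequence


class CallableList:
    """Hypothetical wrapper: look up entries with call syntax, a(i)."""

    def __init__(self, items: Sequence[int]) -> None:
        self._items = list(items)

    def __call__(self, i: int) -> int:
        return self._items[i]


def squares(i: int) -> int:
    """A computed replacement for a stored table of squares."""
    return i * i


def sum_first_three(a: Callable[[int], int]) -> int:
    # The caller only ever writes a(i); it cannot tell a table from a function.
    return a(0) + a(1) + a(2)


table = CallableList([0, 1, 4, 9])
print(sum_first_three(table))    # stored lookup
print(sum_first_three(squares))  # computed lookup, same calling code
```

Both calls go through the same `sum_first_three` body, which is the "why do you want to distinguish?" position in code form.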

username90 · on April 6, 2020

> To me, it makes perfect sense to use a(i) instead of a[i].

I disagree. Square brackets link to memory; parens just return data. Of course you could implement the paren function to return a reference, but that makes the code a lot harder to read, since it is not the normal case. Compare:

    a[i] = 3;

or

    a(i) = 3;

Someone · on April 5, 2020

Some language designers would argue that, conceptually, it is a function call, especially in languages that don’t allow mutations to data structures.

  foo = [10,20]

  Func foo(i) = switch(i)
    case 0: return 10
    case 1: return 20
    default: return nil

beeforpork · on April 5, 2020

That. Exactly.

monocasa · on April 5, 2020

If it is a list, access like that is a function. Even in the context where it's an array, with the dependence we have these days on function inlining, it could still be a function:

  fn someArray(offset: usize) -> T* {
    self.base + (offset * T.size) as T*
  }

ben509 · on April 5, 2020

Another way to make that argument is that arrays are an exponential type and therefore are logically equivalent to functions.

That is, the cardinality of the type Array<T> is exactly the same as the cardinality of the type Function<Integer, T>. Any pure function that takes an integer and returns T can be replaced with an array, and vice versa.

Same deal with Map<K, V>, that's logically equivalent to Function<K, V>.
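To make the counting concrete, here is a small Python sketch under one assumption the argument needs: the array has a fixed length, so the index domain is finite. `DOMAIN` and `VALUES` are illustrative choices, not anything from a real library.

```python
from itertools import product

DOMAIN = range(3)        # array indices 0, 1, 2
VALUES = (False, True)   # the element type T, here with 2 inhabitants

# Every length-3 array over T ...
arrays = list(product(VALUES, repeat=len(DOMAIN)))

# ... is also the value table of exactly one pure function DOMAIN -> T.
functions = [lambda i, arr=arr: arr[i] for arr in arrays]

# Reading each function back over the whole domain recovers the array.
recovered = [tuple(f(i) for i in DOMAIN) for f in functions]

print(len(arrays))          # |T| ** len(DOMAIN), i.e. 2 ** 3
print(recovered == arrays)  # the round trip loses nothing
```

The count is the exponential |T|^n, which is where the name "exponential type" comes from.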

Two problems:

1. People don't think in category theory, they have containers to contain things and functions to calculate things.

2. People are right, the function call really is doing something different than an array index. If a thing is different, it should look different.

username90 · on April 6, 2020

You are wrong: collections are mutable and hence different from functions. Index operators are expected to return l-values while functions are not. That difference is extremely important, so it is worth the compiler overhead to make code easier for users to read.
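Python is one language whose grammar draws exactly this line, and the sketch below asks its own parser. The helper name `is_valid_statement` is made up; `compile` is the built-in and only parses here, so `a` and `i` do not need to exist.

```python
def is_valid_statement(source: str) -> bool:
    """Return True if Python's own parser accepts `source` as a statement."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


print(is_valid_statement("a[i] = 3"))  # subscripts are assignment targets
print(is_valid_statement("a(i) = 3"))  # call results are not
```

This says nothing about what other languages should do; it only shows that the l-value distinction is baked into at least one mainstream grammar.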

ben509 · on April 6, 2020

You can't analyze this stuff with the specifics of a language in mind, or even assume there's a heap, let alone talk about what compiler optimizations apply. Because if we try to do that, what's our common basis of understanding? You could be talking about gcc and I'd be talking about clang and we'd go in circles.

If we limit ourselves to the math, we can agree that a type is a domain that contains certain values and excludes others. But we have to throw out mutability because values, by mathematical definition, are immutable.

And once we've got a clear notion of what values are, we can then talk about how many values there are in a type. And that's where I'm coming from in claiming functions are equivalent to containers.

monocasa · on April 5, 2020

> People are right, the function call really is doing something different than an array index.

What is fundamentally different?

monocasa · on April 6, 2020

I mean, really? Isn't the whole point of the STL to paper over those differences and supply functions with similar semantics for those kinds of accesses, regardless of the data structure?

About the only reason I can see for array access syntax existing is that in the days before optimizing compilers, people would have lost their mind if array access called into a function each time (with good reason). It took manual programmer intervention to hint what compilers couldn't cleanly infer themselves. We don't need that anymore.

ben509 · on April 6, 2020

One is looking up a value, while the other is (in most of these languages) executing a subroutine.

While they are mathematically equivalent, in these languages they are doing different things. Function application and array access are two different concepts the symbols are trying to convey to a programmer.

This is why, e.g., in Haskell where a string is literally a list of characters, they nevertheless have a different syntax for the two.

monocasa · on April 6, 2020

I mean, Haskell literally defines array access in terms of a function.

> Haskell provides indexable arrays, which may be thought of as functions whose domains are isomorphic to contiguous subsets of the integers.

https://www.haskell.org/onlinereport/haskell2010/haskellch14...

The fact that strings are different seems to have more to do with the ergonomics of strings rather than arrays.

ben509 · on April 5, 2020

I think .get would be more plausible. It's just strange that we can't use [] for indexing, which is an operation that can apply to any container, but << and >> must be reserved for bit-shifting.

If I'm adding shifting to a new language, I'm inclined to use functions just so it's clear what's going on, and to put operations like rotation on an even footing. Also, I don't think people have an intuition for the precedence of shift operators, which makes them less useful.

cwzwarich · on April 5, 2020

OCaml uses .() for array access.

aloknnikhil · on April 5, 2020

Right. The difference being the "."

The article proposes using basic function call syntax. That makes no sense to me. And with .(), you now have 3 characters to type instead of 2 with []. Obviously all of this is pedantic, but then again so is the article.

jsjolen · on April 5, 2020

All of what the author said is what Scala does (author likes Scala).

I've written a decent amount of Scala, it's fine.

I've also written a lot of Common Lisp, where array access is done using aref. That was also fine.

unlinked_dll · on April 5, 2020

I think the tone of this post is insufferable, and the arguments have little merit presented this way.

The alternative suggested is also non ideal, since array declaration/indexing is semantically different than a function call, not all languages have method calls, not all languages have postfix method calls, I can think of a few ways to break that syntax (what if the type has a call operator and indexing operator?), and it is hypocritical.

The problem with <> is that they are used elsewhere. If you use [] by sacrificing arrays, () will cause problems because they're used elsewhere.

The lowest friction solution would be to introduce a new two character bracket. How about <: or :>? (: :)? I don't know but writing about it in that tone won't get anyone to do something different.

WalterBright · on April 5, 2020

> Use [] instead of <> for generics. It will save you a lot of avoidable trouble down the road.

D uses !() for generics. It looks odd at first, but soon becomes completely natural:

    struct S(T) { T t; }

    auto a = S!(int)();
    auto b = S!int();   // parentheses after ! are optional with a single argument

Since ! is not a conventional binary operator, this:

1. parses without lookahead and ambiguity issues

2. stands out in the code as being a template instantiation

Well over a decade of experience with it (and D code typically makes heavy use of templates) with no issues shows that it works.

imtringued · on April 6, 2020

From a computer science standpoint it is obvious that generic declarations are just functions that accept type parameters and return a declaration according to the type parameters. So the only reasonable suggestion would be to use the same syntax for both type parameters and normal parameters and optionally add something that distinguishes between the two (like your !).
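A rough Python sketch of "a generic is a function from types to declarations". The function `stack_of` is hypothetical and the element check happens at run time, so this is an analogy rather than how any static type system implements generics.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def stack_of(item_type: type) -> type:
    """Hypothetical 'generic': an ordinary function from a type to a class."""

    class Stack:
        def __init__(self) -> None:
            self._items = []

        def push(self, item) -> None:
            if not isinstance(item, item_type):
                raise TypeError(f"expected {item_type.__name__}")
            self._items.append(item)

        def pop(self):
            return self._items.pop()

    Stack.__name__ = f"Stack({item_type.__name__})"
    return Stack


IntStack = stack_of(int)   # "instantiation" is just a call with ()
s = IntStack()
s.push(1)
print(s.pop())
print(stack_of(int) is IntStack)  # cached: same arguments, same declaration
```

The cache plays the role of instantiation identity: asking for the same type argument twice yields the same class.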

Meanwhile picking either <> or [] feels like that decision is based on personal preference. Obviously, you can justify choosing the most popular syntax because of familiarity but then [] was never an option to begin with.

WalterBright · on April 6, 2020

> From a computer science standpoint it is obvious that generic declarations are just functions that accept type parameters

I'm ashamed that realization was slow in coming to me. But once it did, it was key to greatly simplifying the D template syntax.

chrismorgan · on April 5, 2020

I tried (lightly) to convince people to switch from <> to [] well before Rust 1.0, but even when many (not sure if it was most) agreed that square brackets were superior, there was no will to make such a radical syntactic change, for two main reasons:

1. Square brackets are not uniformly superior. Technically, I think it’s fair to consider them uniformly superior, but social aspects are important too; there was reasonable concern that Rust had exhausted its weirdness budget. (Of the most common languages people may be familiar with when coming to Rust, I think Scala is the only one that uses square brackets for generics; the likes of C++, C# and Java are much more common and use angle brackets.)

2. For Rust specifically, it wouldn’t actually have been just a syntactic change—if you want to change array indexing to use function call syntax, you’ve got to sort out more technical challenges there, things that would be approaching trivial in full-GC languages but which are actually rather difficult in Rust. Specifically, function calls are rvalues only (`… = f()`), but array indexing can be an lvalue also (`a[i] = …`). So you need to more or less unify the Index and Fn trait families, auto-ref/-deref might cause trouble, and placement might make an appearance in it too for best results (and that’s something that still isn’t resolved). Now I’m inclined to believe that these changes would have been a really good thing and resulted in a more coherent and incidentally slightly more flexible language (and not in a dangerous way), but it would now be even more difficult to achieve (though not impossible; the edition system could be used for the syntax side of things, and Fn/Index implementations are still mutually exclusive, because you can’t implement Fn manually on stable, so you could blanket-impl Fn traits for types implementing Index traits without breaking backwards-compatibility).

I’m simplifying the story a bit. Others may fill out more history and technical detail if they desire. I’m going to bed.

steveklabnik · on April 5, 2020

Rust did have [] at some point. It was switched well before 1.0.

chrismorgan · on April 6, 2020

It was switched from [] to <> before 0.1, i.e. unfathomably ancient history.

More information on this and other Rust-related stuff, from a couple of years ago (thus, well after Rust 1.0): https://old.reddit.com/r/rust/comments/6l9mpe/minor_rant_i_w...

fastball · on April 5, 2020

Can't say I agree with most of the last paragraph. I much prefer to have my function calls easily differentiated from my array indexing / hashmap keying.

Though I do agree that array/object literals using [] and {} are probably not necessary.

F-0X · on April 5, 2020

My personal anecdotal argument against "Hard to parse for humans" is... I actually really like generics residing within <>. It's actually harder for me to read generics denoted otherwise (D uses parentheses).

As for the last paragraph, I'd implore language implementors to continue using < > lest we get that awfully ambiguous nonsense.

seanmcdirmid · on April 5, 2020

Scala uses brackets for generics and parens for array accesses. It works nicely, I think.

The primary argument against < and > as generic brackets is that the ambiguity can make for some confusing error messages. It also prevents you from pre-matching your braces before the parsing phase (a technique that enhances error recovery).
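A sketch of what pre-matching can look like, assuming only (), [] and {} count as brackets. `match_brackets` is an illustrative helper, not taken from any real compiler.

```python
from typing import List, Tuple

PAIRS = {")": "(", "]": "[", "}": "{"}
OPENERS = set(PAIRS.values())


def match_brackets(source: str) -> List[Tuple[int, int]]:
    """Pair up (), [] and {} by position, using only a stack and no grammar."""
    stack: List[int] = []
    matches: List[Tuple[int, int]] = []
    for pos, ch in enumerate(source):
        if ch in OPENERS:
            stack.append(pos)
        elif ch in PAIRS:
            if not stack or source[stack[-1]] != PAIRS[ch]:
                raise ValueError(f"unmatched {ch!r} at {pos}")
            matches.append((stack.pop(), pos))
    if stack:
        raise ValueError(f"unclosed {source[stack[-1]]!r} at {stack[-1]}")
    return matches


# Square-bracket generics pair up before any parsing happens.
print(match_brackets("f[Map[K, V]](x)"))
```

Adding "<" and ">" to `PAIRS` would break this on ordinary code such as `a < b`, because the same character is also a comparison operator. That is the error-recovery cost being described.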

dhosek · on April 5, 2020

Maybe it's my bias of having been introduced to parameterized types via C++'s template mechanism, but I find that <> looks natural to me and Scala's use of square brackets feels off. In general, I find myself preferring C-derived syntax.

karlicoss · on April 5, 2020

I find it extremely ironic that Hacker News has swallowed the angle brackets from the title (the article rants about ambiguity during parsing).

Animats · on April 5, 2020

Yes. You can't use HTML in postings here, but the site still eats angle brackets.

Let's see if this will work.

<

Nah.

detaro · on April 5, 2020

<>

Some of the modifications done to titles are indeed weird and counterproductive IMHO.

dang · on April 5, 2020

I managed to get them back in there with a space in between.

Input sanitizing and escaping is one aspect of the HN code I've never dug too much into. There are a lot of corner cases that don't work, but it's never been a high enough priority to fix them.

Analemma_ · on April 5, 2020

If this had been published in, like, 1982, maybe the author would've had a point. But by now we're all so used to <> for generics, especially when scanning code rapidly, that I can't imagine the meager benefits of switching would outweigh the cognitive cost.

troughway · on April 5, 2020

I don’t think in 1982 they had issues with it - most problems that this article purports angle brackets to have are problems we have recently invented.

As for HN not being able to handle having them as characters in titles, one would hope that would be a “simple” matter of escaping the characters at time of submission and parsing them at time of rendering.
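In Python terms, the escaping half could be as small as the sketch below, which uses the standard library's `html.escape`. How HN actually stores and renders titles is not known here, so the escape-at-submission flow is only an assumption.

```python
import html

title = "Use [ ] instead of < > for Generics"
stored = html.escape(title)  # escape once, when the title is submitted
print(stored)
print(html.unescape(stored) == title)  # the original is still recoverable
```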

jackyinger · on April 5, 2020

Try using Verilog, where <= is both the non-blocking assignment operator and the less-than-or-equal operator. At first it might sound/look weird, but you can get used to it easily.

It’s all about the context; I think it would actually be more confusing to go with the author’s suggestion.

Furthermore, I can’t think of a place in code where the generic and comparison use cases of < or > would be ambiguous or even adjacent.

cwzwarich · on April 5, 2020

> Furthermore, I can’t think of a place in code where the generic and comparison use cases of < or > would be ambiguous or even adjacent.

Most languages that use angle brackets for generics (e.g. C++, Rust, Swift) also allow you to explicitly instantiate generics, which leads to a syntactic ambiguity, e.g. is

    a < b > (c)

a function call or two comparisons? The usual way to disambiguate in these cases is to check whether b is plausibly a type and treat it as a generic, but this is ugly on a few different levels.
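For contrast, Python has no angle-bracket generics, so its parser always takes the "two comparisons" reading. The sketch below uses the standard `ast` module to show that; the values of a, b and c are arbitrary.

```python
import ast

tree = ast.parse("a < b > (c)", mode="eval")
node = tree.body

print(type(node).__name__)                     # the kind of expression parsed
print([type(op).__name__ for op in node.ops])  # the operators, in order

# Chained comparison semantics: (a < b) and (b > c)
a, b, c = 1, 5, 2
print(a < b > (c))
```

So the comparison reading is not hypothetical: it is a legal parse that a language with angle-bracket generics has to rule out somehow.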

unlinked_dll · on April 5, 2020

Rust uses the "turbo fish" to get around this. Your example wouldn't compile unless a and b were valid variable names.

    a::<b> (c)

Is how you instantiate a generic. It's not that ugly. Bigger problem is value type generics/const generics where you want to do some logic like

     a<b > c>

cwzwarich · on April 5, 2020

> Bigger problem is value type generics/const generics

How does Rust solve the ambiguities with const generics?

monocasa · on April 5, 2020

The ambiguous const generics have to be surrounded by { }:

  fn function<T, { ambiguous const generic expression }>(arg1: T) -> T {

unlinked_dll · on April 5, 2020

I believe the current proposal (partially supported on nightly) is to require expressions in generic arguments to be enclosed in {}

chrismorgan · on April 5, 2020

The expression must be wrapped in curly braces: Type<{ expression }>

seanmcdirmid · on April 5, 2020

> The usual way to disambiguate in these cases is to check whether b is plausibly a type and treat it as a generic

Most compilers don't even go that far: using type information during parsing is a huge no-no for almost all languages that aren't C++. So, for example, TypeScript will flag:

    a<b>(c)

as an error if a, b, and c are untyped, meaning it automatically resolved the generic application in the parser before type information was processed, while this is fine:

    a<b>=(c)

Since the equal sign means > is part of a >= token. The parser is automatically determining if generic application was meant or not, before any type info is considered.

cwzwarich · on April 5, 2020

That's why I said "plausibly". From the TypeScript spec, it looks like it takes the Boolean grammar approach (treat it like a type expression if it's a valid type expression, otherwise don't):

> The grammar ambiguity is resolved as follows: In a context where one possible interpretation of a sequence of tokens is an Arguments production, if the initial sequence of tokens forms a syntactically correct TypeArguments production and is followed by a '(' token, then the sequence of tokens is processed an Arguments production, and any other possible interpretation is discarded. Otherwise, the sequence of tokens is not considered an Arguments production.
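A toy Python version of that rule, to show it needs only tokens and no type information. The tokenizer, the tiny "Type" grammar (an identifier with optional nested type arguments) and the function names are all assumptions for illustration; the real TypeScript grammar is far richer.

```python
import re
from typing import List, Optional

TOKEN = re.compile(r"[A-Za-z_]\w*|>=|<=|[<>(),]")


def tokenize(source: str) -> List[str]:
    return TOKEN.findall(source)


def parse_type_arguments(tokens: List[str], pos: int) -> Optional[int]:
    """Try to read `< Type (, Type)* >` at pos; return the index after it."""
    if pos >= len(tokens) or tokens[pos] != "<":
        return None
    pos += 1
    while True:
        # A toy Type: an identifier, optionally with its own type arguments.
        if pos >= len(tokens) or not tokens[pos].isidentifier():
            return None
        pos += 1
        nested = parse_type_arguments(tokens, pos)
        if nested is not None:
            pos = nested
        if pos < len(tokens) and tokens[pos] == ",":
            pos += 1
            continue
        if pos < len(tokens) and tokens[pos] == ">":
            return pos + 1
        return None


def is_generic_call(source: str) -> bool:
    """Apply the rule to `name < ... > (` at the start of source."""
    tokens = tokenize(source)
    end = parse_type_arguments(tokens, 1)
    return end is not None and end < len(tokens) and tokens[end] == "("


print(is_generic_call("a<b>(c)"))   # type arguments followed by '('
print(is_generic_call("a<b>=(c)"))  # '>=' is one token, so no closing '>'
```

The second case matches the observation earlier in the thread: once the lexer has produced a >= token, the type-argument reading is never attempted.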

seanmcdirmid · on April 5, 2020

Right... the big weakness of this approach is the error messages, which are fairly obtuse if you ever happen to run into this corner case. Luckily, most programmers won't.

owl57 · on April 5, 2020

If you need compatibility with C, JS or whatever, then you can't use square brackets for generics anyway. If you don't need such compatibility, why can't you just make that a function call and tell users to avoid funny combinations of comparisons and parens?

u801e · on April 5, 2020

It looks like the < and > characters need to be escaped because they're missing from the story title when viewed on HN.

mkchoi212 · on April 5, 2020

One more reason why they shouldn’t be used for generics??

AnimalMuppet · on April 5, 2020

When I write code, whether I can easily make an HN headline out of the code is nowhere on my list of concerns.

chrismorgan · on April 6, 2020

But when I write articles, how it will be handled by various platforms is on my list of concerns. So many platforms damage things with & < > ' " in them, stripping angle brackets and maybe their contents, and possibly entity-encoding all of those characters. (I find that curly quotes are harmed less often than straight quotes these days!)

When I wrote https://chrismorgan.info/blog/rust-artwork-owl/, which has the title “<_>::v::<_>”, I thought carefully about how it would be handled by feed readers, &c. Fortunately I had already been very careful in the site implementation (e.g. I support HTML in my titles and can strip it or have a different plaintext title, so angle brackets were definitely handled correctly) and had made the deliberate decision that my feed was Atom and uses <title type="html">, so feed readers should all get it right (though doubtless some will ignore the declared semantics and butcher it); if it had been an RSS feed (which doesn’t support declaring the type of a title), some feed readers would have stripped it to “::v::” (and been justified in so doing). Reddit handled it just fine, but maybe it’s just as well I didn’t submit it to HN!

praptak · on April 5, 2020

The bracket ambiguity in C++ is much worse than the article describes.

    Foo < a , b > x;

This can be either a template instantiation or a sequence of comparisons.

tasogare · on April 5, 2020

That’s the most ridiculous article I read this week. At this point, the fact that I can’t input < and > with my virtual AZERTY layout on a physical Japanese keyboard could be added to it to make it more substantial.

leshow · on April 5, 2020

IMO Haskell's way of defining type parameters with forall is the best option. It doesn't fit with C-style function definitions though.

It's kind of a stupid thing to rant over. Whether a language uses <> or [] is the least interesting thing about it, but that doesn't stop everyone from bikeshedding.

stefanos82 · on April 5, 2020

D language lets you define your template(s) in the following format:

    auto add(T)(T lhs, T rhs) {
        return lhs + rhs;
    }

and to instantiate it you do the following:

    add!int(5, 10);
    add!float(5.0f, 10.0f);
    // won't compile; Animal doesn't implement +
    add!Animal(dog, cat);

Personally, as a C++ adventurist I find the instantiating syntax a bit confusing, but quite tolerable.

I wonder whether anyone attempted to use "|T|" as their template syntax...Ruby uses it in iterators (I think) and if I'm not mistaken Rust uses it with closures.

Wouldn't it be a bit cleaner to have something like the following?

    |T| something (T foo) -> T {
      // do something in here
      return foo;
    }

Personally I prefer it.

Gaelan · on April 5, 2020

> Ruby uses it in iterators (I think) and if I'm not mistaken Rust uses it with closures.

Ruby and Rust both use them for closures (I assume Rust took them from Ruby). Nearly all of Ruby's iteration constructs are based on closures, so it makes sense that you understood them as for iterators.

ken · on April 5, 2020

I agree with some parts of the problem (those < > chars already have other uses), but not others (guillemets are even shorter and they read just fine). I don't care for the proposed solution: it's just robbing Peter to pay Paul.

Now people are proposing digraphs and trigraphs as alternatives. Have we learned nothing from C?

I'm annoyed that (even in 2020) every programming language restricts its syntax to ASCII characters (1967) which can be typed on an IBM Model F keyboard (1981). Unicode has dozens of unique styles of brackets. Everybody's using an OS that supports Unicode, and an editor/IDE that autocompletes most of their source code anyway. We're even using programming languages which allow Unicode identifiers, so ASCII-only viewers have been in trouble for 25 years already. People are even using emoji in commit messages. That ship has sailed. Unicode is safe to use.

It could be List⟦Int⟧ or List⟬Int⟭ or List｢Int｣ or dozens of others. They're big, they're clear, they're easy to parse (one char, no other uses). All you need to do is pick one and make it part of the language, and update a few editor modes to support it in some templates.

ledauphin · on April 5, 2020

the question for _programming_ is not solely about readability, it's also about writability.

imagine being a beginner programmer and not even being able to find the character you need to make your program work without learning about Unicode...

bmn__ · on April 5, 2020

Just allow both. https://docs.raku.org/language/unicode_ascii

gurkendoktor · on April 5, 2020

It's not going to happen, but my favorite solution to the parsing ambiguities would be to parse < as a comparison if there is whitespace around it, and as a template bracket otherwise. Kind of how "puts -1" in Ruby prints -1, and "puts - 1" tries to subtract 1 from the result of puts.
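A rough Python sketch of that rule, reading "whitespace around it" as whitespace on both sides (an assumption; the one-sided cases are left as "template" here). `classify_angle_open` is a made-up helper that only looks at neighbouring characters.

```python
from typing import List, Tuple


def classify_angle_open(source: str) -> List[Tuple[int, str]]:
    """Label each '<' by the proposed rule: spaces on both sides = comparison."""
    labels = []
    for pos, ch in enumerate(source):
        if ch != "<":
            continue
        before = source[pos - 1] if pos > 0 else " "
        after = source[pos + 1] if pos + 1 < len(source) else " "
        kind = "comparison" if before.isspace() and after.isspace() else "template"
        labels.append((pos, kind))
    return labels


print(classify_angle_open("a < b"))
print(classify_angle_open("Foo<a, b> x"))
```

The decision is purely lexical, which is the appeal; the cost is that `a<b` and `a < b` would mean different things.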

quink · on April 5, 2020

Fun fact:

    Array(1, 2, 3)
    someList(0)
    array(0) = 23.42
    map("name") = "Joe"

All of this is valid MUMPS syntax and does what you'd think.

I'd loooove to hear the author's thoughts on the above then extending weird parentheses syntax crap to function calls, MUMPS style too:

    USER>s foo(1) = "hi,there"
    USER>s $p(foo(1), ",", 2) = "you"
    USER>w foo(1)
    hi,you
    USER>

Maybe I'm warning of a slippery slope here, but this is a direction that I'm not sure you'll want to go down.

Also a fun fact, MUMPS has very dynamic typing and thus very much no type annotations.

Someone · on April 5, 2020

I think that describes Scala:

- https://docs.scala-lang.org/overviews/collections/arrays.htm...

- https://docs.scala-lang.org/tour/generic-classes.html

Unfortunately, Scala caused parsing problems by allowing alphabetic function names as operators, grabbing implicit arguments from all over the place, and leaving out () in function calls (I think they reversed the latter decision).

kalekold · on April 5, 2020

I personally like the way D handles it.

j88439h84 · on April 5, 2020

Which is

WalterBright · on April 5, 2020

https://dlang.org/spec/template.html

FpUser · on April 5, 2020

Article about a complete non issue.

nyanpasu64 · on April 5, 2020

Python's typing module (as checked by tools like mypy) uses indexing, i.e. square brackets, for generics. Not sure if I find it confusing in complex situations.
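For reference, a short sketch of what that looks like with the standard `typing` module. `Box` is a made-up class; `Generic`, `TypeVar`, `get_origin` and `get_args` are the real names.

```python
from typing import Generic, List, TypeVar, get_args, get_origin

T = TypeVar("T")


class Box(Generic[T]):
    """A user-defined generic; its parameter is supplied by subscription."""

    def __init__(self, item: T) -> None:
        self.item = item


names: List[str] = ["a", "b"]  # [] as a type argument in the annotation
print(names[0])                # [] as plain indexing on the value

alias = Box[int]               # subscripting the class yields a generic alias
print(get_origin(alias) is Box)
print(get_args(alias))
```

The same brackets do double duty here, and the two uses are told apart only by whether the thing being subscripted is a type or a value.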

978e4721a · on April 5, 2020

For God's sake, generics are just functions over types. Use () like Haskell does.

travisgriggs · on April 5, 2020

Why does it have to be an enclosing pair of infix characters?

ben509 · on April 5, 2020

Generally, I think Type<Param, Param> is like a function call, you are applying two parameters to a generic type to get a concrete type.

But, you could curry generic parameter application, and then have a simple operator, so it'd be Type ! Param ! Param. Thus:

    Map ! Str ! Int a = 5

Not sure I like that.